Analysing Ultrasound Tongue Imaging data using (M)FPCA and GAMM

Author
Affiliation

Nathalie Elsässer

Acoustics Research Institute,
Austrian Academy of Sciences, Vienna

Published

August 26, 2025

This script was written using the help of three great tutorials – Gubian (2025); Coretta and Sakr (2025); Wieling (2018).

Code
# data processing and plotting
library(tidyverse)
library(plotly)
library(rticulate)
# GAMs
library(mgcv)
library(itsadug)
# FPCA
library(funData)
library(MFPCA)
# install.packages("devtools")
# devtools::install_github("uasolo/landmarkregUtils")
library(landmarkregUtils)
library(abind)
# LMER
library(lme4)
library(emmeans)
# plotting
library(RColorBrewer)
library(gridExtra)
source("polar_utils.R")

set seed:

Code
set.seed(207)

1 static data

In this first part, we are using the static data that we created in the \(data\_preparation\) steps.

Read in the Cartesian static data set:

Code
rhotic_data_static <- readRDS("../data/RDS/rhotic_data_static.rds")

The important parts of the data set for this part of the script are:

  • \(participant\): The data set contains the data of 8 participants. They are abbreviated by “vp” (Versuchsperson) plus their participant number. The participants included have the numbers 4, 5, 6, 8, 9, 10, 20 and 23
  • \(target\): The target words that were read by the participants
  • \(knot\): Each AAA tongue spline is defined by 11 knots. 0 is the knot closest to the tongue root, 10 is the knot closest to the tongue tip.
  • \(frame\_id\): All knots on one AAA tongue spline share the same \(frame\_id\) to indicate that the rows in the table belong together.
  • \(x\_mean\): The mean x-value of the target-/r/ of a participant
  • \(y\_mean\): The mean y-value of the target-/r/ of a participant
  • \(x\_mean\_z\): The mean x-value of the target-/r/ of a participant - z-scored
  • \(y\_mean\_z\): The mean y-value of the target-/r/ of a participant - z-scored

The goal of this script is to find out if there are differences in the production of the rhotic between the different speakers – regardless of syllable position, phonetic environment or other factors.

1.1 GAMM

GAMM is a generalized additive mixed model, for a general tutorial see Wieling (2018).

1.1.1 Polar coordinates

Polar coordinates use a radius and an angle to define coordinates. The angle is a predictor variable used to predict the radius (radius = f(angle)).

Load the data:

Code
rhotic_data_static_polar <- readRDS("../data/RDS/rhotic_data_polar_static.rds")

The data set has the same structure as the Cartesian data set described above but instead of x_mean- and y_mean-values, it has angle- and radius-values. [Note: We will call this data the “original”/OG Polar data (in comparison to the Linear/Procrustean normalized data), but in reality, this data is not the “original”/raw data that we got from AAA, but data that was first z-scored as Cartesian coordinates and then (after the z-scoring) converted to Polar coordinates.]

We can plot the Polar data in a Cartesian coordinate system. Here, each radius value corresponds to exactly one angle value, whereas in the Cartesian data one x-value could have multiple y-values (because the tongue can bend back on itself).

If we plot the Polar data in a Cartesian coordinate system with angle along the x-axis and radius along the y-axis, it looks like this:

Code
# plot raw Polar 
ggplot(rhotic_data_static_polar) +
  aes(angle, radius, group = frame_id) +
  geom_path() +
  facet_wrap(vars(participant), ncol = 4)
Figure 1: Static polar data plotted in Cartesian coordinate system

Note that the plots here are mirrored in comparison to what we’re normally used to when looking at tongue contours - the tongue tip is pictured along the left side (small angle values) while the tongue root is along the right side.

In a Polar coordinate system, the data looks like this:

Code
ggplot(rhotic_data_static_polar) +
    aes(angle, radius, group = frame_id) +
    geom_path() +
    radial_plot(5) +
  facet_wrap(vars(participant), ncol = 4)
Figure 2: Static data in Polar coordinate system

(Here the tongue tip is pictures on the right side, tongue root on the left side.)

As input formula for the GAM, we take the basic formula radius ~ angle – angle is the independent variable while radius is the dependent variable. We aim to predict the radius based on two predictor variables: angle and participant.

Code
polar_gam_static <- bam(radius ~ participant + s(angle, by = factor(participant)),
                          discrete = TRUE,
                        family = 'scat',
                        data = rhotic_data_static_polar
)
AR1.rho <- acf_resid(polar_gam_static)[2]

Taking autocorrelation into account (see Wieling (2018)):

Code
polar_gam_static <- bam(radius ~ participant + s(angle, by = factor(participant)),
                          discrete = TRUE,
                        family = 'scat',
                        rho = AR1.rho,
                        AR.start = rhotic_data_static_polar %>% pull(knot) == 0,
                        data = rhotic_data_static_polar
)
summary(polar_gam_static)

Family: Scaled t(3.702,0.13) 
Link function: identity 

Formula:
radius ~ participant + s(angle, by = factor(participant))

Parametric coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       2.748542   0.007569 363.150  < 2e-16 ***
participantvp005 -0.028711   0.009270  -3.097  0.00196 ** 
participantvp006  0.157684   0.060205   2.619  0.00882 ** 
participantvp008  0.508269   0.181831   2.795  0.00519 ** 
participantvp009  0.070077   0.011181   6.268 3.73e-10 ***
participantvp010 -0.029571   0.010347  -2.858  0.00427 ** 
participantvp020 -0.051009   0.010790  -4.728 2.28e-06 ***
participantvp023  0.028539   0.009251   3.085  0.00204 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                    edf Ref.df    F p-value    
s(angle):factor(participant)vp004 8.764  8.977 3173  <2e-16 ***
s(angle):factor(participant)vp005 8.942  8.999 7851  <2e-16 ***
s(angle):factor(participant)vp006 8.542  8.800 7750  <2e-16 ***
s(angle):factor(participant)vp008 8.014  8.104 7069  <2e-16 ***
s(angle):factor(participant)vp009 8.683  8.937 3179  <2e-16 ***
s(angle):factor(participant)vp010 8.741  8.954 7171  <2e-16 ***
s(angle):factor(participant)vp020 8.886  8.990 7793  <2e-16 ***
s(angle):factor(participant)vp023 8.953  8.999 7261  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.955   Deviance explained = 79.6%
fREML =  33545  Scale est. = 1         n = 23870
Code
gam.check(polar_gam_static)


Method: fREML   Optimizer: perf chol
$grad
[1] -1.559641e-12 -6.883383e-13 -4.027080e-09  2.638308e-07  7.893242e-12
[6]  1.301936e-11  1.325002e-10 -8.149037e-13

$hess
              [,1]          [,2]          [,3]          [,4]          [,5]
[1,]  3.918426e+00  5.358695e-19 -1.953158e-17  2.896507e-17 -7.998318e-20
[2,]  4.790273e-32  3.978714e+00  2.153581e-30 -1.369402e-31  4.304696e-34
[3,] -1.412954e-16  3.966717e-30  1.434901e+00  1.973094e-29  2.185871e-32
[4,] -2.621822e-16  7.771291e-31 -1.111963e-27  2.442363e+00  4.100810e-31
[5,]  8.217081e-19 -1.214714e-32 -7.362142e-30 -1.252619e-30  3.699677e+00
[6,] -1.179878e-18 -7.500947e-32  1.818211e-28 -2.235582e-30  2.601054e-32
[7,] -1.009545e-17 -3.480787e-31 -1.074289e-28 -3.267773e-30  1.693391e-32
[8,] -3.098512e-19  3.204277e-33 -3.437788e-30  2.525972e-31  4.918788e-34
              [,6]          [,7]          [,8]
[1,]  4.602215e-18  5.498904e-18 -2.567126e-19
[2,]  1.866329e-31 -1.866329e-31 -6.087303e-34
[3,]  8.352180e-29 -8.800584e-29 -1.009674e-30
[4,] -6.244487e-29 -1.664617e-29 -1.030394e-30
[5,]  2.111453e-32  1.056812e-31 -1.418499e-32
[6,]  3.786879e+00 -3.799774e-30 -1.382843e-32
[7,] -5.307519e-30  3.790286e+00  2.878063e-32
[8,]  3.604982e-32  8.940799e-32  3.961315e+00

Model rank =  80 / 80 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

                                    k'  edf k-index p-value    
s(angle):factor(participant)vp004 9.00 8.76    0.96  <2e-16 ***
s(angle):factor(participant)vp005 9.00 8.94    0.96  <2e-16 ***
s(angle):factor(participant)vp006 9.00 8.54    0.96  <2e-16 ***
s(angle):factor(participant)vp008 9.00 8.01    0.96  <2e-16 ***
s(angle):factor(participant)vp009 9.00 8.68    0.96  <2e-16 ***
s(angle):factor(participant)vp010 9.00 8.74    0.96  <2e-16 ***
s(angle):factor(participant)vp020 9.00 8.89    0.96  <2e-16 ***
s(angle):factor(participant)vp023 9.00 8.95    0.96  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
acf_resid(polar_gam_static)

Now that we’ve calculated the GAM, we get the predictions and plot them:

Code
angle_grid_plot <- seq(min(rhotic_data_static_polar$angle), max(rhotic_data_static_polar$angle),
                       length.out = 20) # make a prediction plot with 20 values
static_polar_pred <- get_predictions(polar_gam_static,
                        cond = list(angle = angle_grid_plot, 
                                    participant = factor(rhotic_data_static_polar$participant) %>% levels()),
                        print.summary = FALSE)

gam_polar_plot <- ggplot(static_polar_pred) +
  aes(x = angle, y = fit, group = participant ) +
  geom_line(mapping = aes(color = participant)) +
  geom_ribbon(mapping = aes(ymin = fit - CI, ymax = fit + CI, fill = participant), alpha = 0.5) +
  radial_plot(10) +
  theme(legend.position = "bottom")

Plot the predictions:

Code
gam_polar_plot
Figure 3: GAM predicted contours

The predicted contours for the different participants appear to match the data well when compared to the original contours. Participant 8 shows the highest tongue tip while participant 9 shows the downwardly curved tongue tip. Also, there is variation in the area around the tongue dorsum – while participant 8 has a very flat tongue dorsum, most other partcipants show a raised tongue dorsum.

However, it appearsto be an issue with the very left side of the data, specifically the values at the tongue root. This is particularly noticeable for participants 6 and 8. There may not be enough data points to accurately calculate tongue contours at angle values of 7/8\(\pi\) and higher. A similar problem occured in Gubian (2025) (UTI_analysis_code.html#sec-st_angle_GAM). Gubian (2025) suggests doing a Linear angle normalization and a Procrustean fit (“we establish a fixed angle range for all contours, cut off the points that are beyond that range and extrapolate points for contours that do not have values up to the range limits”, Gubian (2025)). We will apply both methods on this data set to determine their impact and assess if the predicted values at the tongue root improve.

1.1.1.1 Linear Angle Normalization

We will now do a linear angle normalization similar to what Gubian (2025) describes:

“A simple way to establish the reference angle range is to take the median angle of the first/last point from all contours. The contours are then reinterpolated on a regular grid within the common angle range. This of course introduces approximations, including the fact that now knots lose their original interpretation.” (Gubian 2025)

Code
angle_range <- rhotic_data_static_polar  %>% 
  filter(knot %in% range(knot)) %>% # first and last knot
  group_by(knot) %>% 
  summarise(median_angle = median(angle)) %>% # median angle for first and last knot (all participants)
  pull(median_angle) %>% 
  sort()

angle_grid <- seq(angle_range[1], angle_range[2], length.out=11) # 11 equidistant points between the median of knot0 & knot10

linear_norm_static_polar_curves  <- rhotic_data_static_polar  %>% 
  group_by(participant, frame_id) %>% 
  # linear angle normalization on the angle_range interval
  mutate(angle = (angle - min(angle)) * diff(angle_range) / diff(range(angle)) + angle_range[1]) %>% 
  reframe(bind_cols(
    knot = rev(seq(along.with = angle_grid, from = 0)), # fake knots, needed for AR1 term in GAM; knots have no meaning
    approx(angle, radius, angle_grid) %>% as_tibble() # returns list of points which linearly interpolate given data points
    )) %>% 
  rename(angle = x, radius = y)

I noticed that this produces a few NA values at knot 0, e.g.:

Code
ex_og <- rhotic_data_static_polar %>% filter(participant == "vp004" & frame_id == 4) %>% 
  ggplot() +
  aes(angle, radius, group = frame_id) +
    geom_path() +
  geom_point() +
  geom_text(aes(label = knot), check_overlap = TRUE,
            nudge_x = .05, nudge_y = .05, show.legend = FALSE) +
    ggtitle("OG data; participant 4; \"bankrott\"") +
    radial_plot(5) +
    theme(legend.position = "bottom")
ex_linear <- linear_norm_static_polar_curves %>% filter(participant == "vp004" & frame_id == 4) %>% 
  ggplot() +
    aes(angle, radius, group = frame_id) +
    geom_path() +
  geom_point() +
  geom_text(aes(label = knot), check_overlap = TRUE,
            nudge_x = .05, nudge_y = .05, show.legend = FALSE) +
    ggtitle("linear norm data; participant 4; \"bankrott\"\nKnot 0 is missing") +
    radial_plot(5) +
    theme(legend.position = "bottom")
Code
grid.arrange(ex_og, ex_linear, ncol = 2)
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_path()`).
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_point()`).
Warning: Removed 1 row containing missing values or values outside the scale range
(`geom_text()`).
Figure 4: Comparison original data and linear data on one example (missing knot 0)

We will ignore this for now (because I don’t know what to do about it :D)

Comparison of the old data set and the linear angle normalized data set:

Code
polar_plot_z_per_part <- ggplot(rhotic_data_static_polar) +
    aes(angle, radius, group = frame_id, color = participant) +
    geom_path() +
    ggtitle("OG data") +
    radial_plot(5) +
    theme(legend.position = "bottom") + facet_wrap(vars(participant), ncol = 3)
polar_plot_linear_norm <- ggplot(linear_norm_static_polar_curves) +
    aes(angle, radius, group = frame_id, color = participant) +
    geom_path() +
    ggtitle("linear norm data") +
    radial_plot(5) +
    theme(legend.position = "bottom") + facet_wrap(vars(participant), ncol = 3)
Code
grid.arrange(polar_plot_z_per_part, polar_plot_linear_norm, ncol = 2)
Figure 5: Comparison original data and linear data; all contours

The new tongue contours look different but overall keep their general shape. The ones with an upward-pointing tongue tip (e.g., participant 8) might look implausible. We have to keep that in mind when interpreting the following results.

We will now calculate a new GAM:

Code
GAM_linear_norm <- bam(radius ~ participant + s(angle, by = factor(participant)),
            discrete = TRUE,
            family = 'scat',
            data = linear_norm_static_polar_curves
)
AR1.rho <- acf_resid(GAM_linear_norm)[2]

Taking autocorrelation into account:

Code
GAM_linear_norm <- bam(radius ~ participant + s(angle, by = factor(participant)),
                          discrete = TRUE,
                        family = 'scat',
                        rho = AR1.rho,
                        AR.start = linear_norm_static_polar_curves %>% pull(knot) == 0,
                        data = linear_norm_static_polar_curves
)
summary(GAM_linear_norm)

Family: Scaled t(3.876,0.135) 
Link function: identity 

Formula:
radius ~ participant + s(angle, by = factor(participant))

Parametric coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       2.882967   0.009064 318.071  < 2e-16 ***
participantvp005  0.071869   0.011053   6.502 8.08e-11 ***
participantvp006  0.208073   0.011049  18.831  < 2e-16 ***
participantvp008  0.292427   0.011165  26.191  < 2e-16 ***
participantvp009 -0.031647   0.012844  -2.464   0.0137 *  
participantvp010  0.076997   0.011049   6.968 3.29e-12 ***
participantvp020  0.177837   0.011027  16.128  < 2e-16 ***
participantvp023  0.025222   0.011048   2.283   0.0224 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                    edf Ref.df     F p-value    
s(angle):factor(participant)vp004 8.882  8.997  8283  <2e-16 ***
s(angle):factor(participant)vp005 8.966  9.000 20560  <2e-16 ***
s(angle):factor(participant)vp006 8.926  8.999 19767  <2e-16 ***
s(angle):factor(participant)vp008 8.972  9.000 27329  <2e-16 ***
s(angle):factor(participant)vp009 8.941  8.999  5651  <2e-16 ***
s(angle):factor(participant)vp010 8.931  8.999 25201  <2e-16 ***
s(angle):factor(participant)vp020 8.933  8.999 21031  <2e-16 ***
s(angle):factor(participant)vp023 8.966  9.000 22654  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.959   Deviance explained = 83.4%
fREML =  28865  Scale est. = 1         n = 23846
Code
gam.check(GAM_linear_norm)


Method: fREML   Optimizer: perf chol
$grad
[1]  4.254819e-12 -3.699263e-13  1.986411e-11 -3.561595e-13 -3.810285e-12
[6]  5.647927e-12  4.347633e-13  5.107026e-14

$hess
              [,1]          [,2]          [,3]          [,4]          [,5]
[1,]  3.977832e+00 -1.226022e-06 -2.271527e-08 -2.448353e-10 -2.267630e-13
[2,] -1.226022e-06  3.986952e+00 -4.109272e-06 -4.428962e-08 -4.102040e-11
[3,] -2.271527e-08 -4.109272e-06  3.978576e+00 -3.301253e-05 -3.057558e-08
[4,] -2.448353e-10 -4.428962e-08 -3.301253e-05  3.943860e+00 -1.480571e-05
[5,] -2.267629e-13 -4.102040e-11 -3.057558e-08 -1.480571e-05  3.990400e+00
[6,] -2.786845e-16 -5.036846e-14 -3.754338e-11 -1.817962e-08 -3.042165e-07
[7,] -6.046064e-18 -1.085212e-15 -8.088891e-13 -3.916879e-10 -6.544616e-09
[8,] -4.086867e-19  2.112466e-18  1.572912e-15  7.616500e-13  1.272620e-11
              [,6]          [,7]          [,8]
[1,] -2.784364e-16 -5.963404e-18 -1.859791e-19
[2,] -5.036846e-14 -1.085212e-15  2.111579e-18
[3,] -3.754338e-11 -8.088891e-13  1.572912e-15
[4,] -1.817962e-08 -3.916879e-10  7.616500e-13
[5,] -3.042165e-07 -6.544616e-09  1.272620e-11
[6,]  3.969119e+00 -1.917124e-07  3.721376e-10
[7,] -1.917124e-07  3.987991e+00  3.022846e-07
[8,]  3.721376e-10  3.022846e-07  3.982543e+00

Model rank =  80 / 80 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

                                    k'  edf k-index p-value
s(angle):factor(participant)vp004 9.00 8.88       1    0.40
s(angle):factor(participant)vp005 9.00 8.97       1    0.46
s(angle):factor(participant)vp006 9.00 8.93       1    0.48
s(angle):factor(participant)vp008 9.00 8.97       1    0.43
s(angle):factor(participant)vp009 9.00 8.94       1    0.43
s(angle):factor(participant)vp010 9.00 8.93       1    0.41
s(angle):factor(participant)vp020 9.00 8.93       1    0.44
s(angle):factor(participant)vp023 9.00 8.97       1    0.46
Code
acf_resid(GAM_linear_norm)

Get the predictions and plot:

Code
angle_grid_plot <- angle_grid
pred <- get_predictions(GAM_linear_norm,
                        cond = list(angle = angle_grid_plot, 
                                    participant = factor(rhotic_data_static_polar$participant) %>% levels()),
                        print.summary = FALSE)

gam_polar_lin_plot <- ggplot(pred) +
  aes(x = angle, y = fit, group = participant) +
  geom_line(mapping = aes(color = participant)) +
  geom_ribbon(mapping = aes(ymin = fit - CI, ymax = fit + CI, fill = participant), alpha = 0.5) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
gam_polar_lin_plot
Figure 6: GAM predicted contours (linear normalization)

The most important point first: the tongue root appears noticeably improved compared to the previous plot.

The tongue tips of participants 8, 10 and 23 are clearly raised in this prediction plot. In the prediction plot of the first Polar GAM, the tongue tip of participant 23 looked different and more pointed downward. Another (noteworthy) difference concerns the region between tongue dorsum and tongue root. In the new prediction plot, the participants differ a lot from each other in this region – e.g., participant 8’s tongue contour is more bulbous and goes further back than the contour of the other participants in that region, the contour of participant 5 is very steep, etc. In the first prediction plot, all of the participants had a very similar tongue shape in that region. It’s difficult to determine which plot better represents the actual data. The participants do show differences in the tongue root/tongue dorsum region, but it seems slightly over-exaggerated in the new plot using the Linear normalized data, e.g., participant 8’s contours do seem bulbous in that part but they do not seem to be that far back in comparison with the other participants.

One thing that stands out are the very small confidence interval ribbons in the new plot. The original data clearly shows a large spread for nearly all the participants in the tongue tip area, especially for participants 5 and 6. While the old plot showed a wider ribbon in this region, the new plot shows similarly narrow confidence intervals at the tongue tip as in any other region.

1.1.1.2 Procrustean angle range

We will now try an additional way of normalizing the data – the Procrustean fit. This is typically used to analyse shapes. In this case we use it to make all of the tongue contours comparable to the median tongue contour that we calculated above (median of knot 0, median of knot 10 and then interpolate for the “knots” inbetween) and saved in the angle_grid.

Code
proc_static_polar_curves <- rhotic_data_static_polar %>% 
  group_by(participant, frame_id) %>% 
  # interpolation using the angle_grid
  reframe(bind_cols(
    knot = rev(seq(along.with = angle_grid, from = 0)), # fake knots, needed for AR1 term in GAM
    approx(angle, radius, angle_grid, rule = 2) %>% as_tibble()
    )) %>% 
  rename(angle = x, radius = y)

Comparison of the original data set and the Procrustean fit data set:

Code
polar_plot_proc <- ggplot(proc_static_polar_curves) +
    aes(angle, radius, group = frame_id, color = participant) +
    geom_path() +
    ggtitle("Procrustean fit data") +
    radial_plot(5) +
    theme(legend.position = "bottom") + facet_wrap(vars(participant), ncol = 3)
Code
grid.arrange(polar_plot_z_per_part, polar_plot_proc, ncol = 2)

In comparison to the linear normalized, the Procrustean fit data looks more similar to the original data. Especially the contours of participant 8 (Linear normalized: their contours looked very flat, the upward-pointing tongue tip nearly got lost) and participant 10 (linear normalized: back part of the tongue contours looked bulbous and like there was a huge spread between the tongue contours that can’t be found in the original data).

The only thing that stands out as particularly strange is the behavior of some of participant 8’s tongue contours at the tongue root. Instead of extending straight downward, they appear to “jump around”:

Code
polar_plot_proc_ex_8 <- ggplot(proc_static_polar_curves %>% filter(participant == "vp008" & frame_id %in% c("1031", "1044", "944", "896"))) +
    aes(angle, radius, group = frame_id, color = participant) +
    geom_path() +
    ggtitle("examples of weird tongue roots of participant 8\nProcrustean fit data") +
    radial_plot(5) +
    theme(legend.position = "bottom") + facet_wrap(vars(frame_id), ncol = 2)
Code
polar_plot_proc_ex_8
Figure 7: Example: distortion of tongue contour of participant 8 through Procrustean normalization

Let’s keep this in mind when interpreting the results later.

Then calculate the new GAM:

Code
GAM_proc <- bam(radius ~ participant + s(angle, by = factor(participant)),
            discrete = TRUE,
            family = 'scat',
            data = proc_static_polar_curves
)
AR1.rho <- acf_resid(GAM_proc)[2]

Taking autocorrelation into account:

Code
GAM_proc <- bam(radius ~ participant + s(angle, by = factor(participant)),
                          discrete = TRUE,
                        family = 'scat',
                        rho = AR1.rho,
                        AR.start = proc_static_polar_curves %>% pull(knot) == 0,
                        data = proc_static_polar_curves
)
summary(GAM_proc)

Family: Scaled t(3.51,0.12) 
Link function: identity 

Formula:
radius ~ participant + s(angle, by = factor(participant))

Parametric coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       3.032327   0.007127 425.460  < 2e-16 ***
participantvp005 -0.009478   0.008704  -1.089 0.276227    
participantvp006 -0.072592   0.008697  -8.347  < 2e-16 ***
participantvp008 -0.099494   0.008789 -11.321  < 2e-16 ***
participantvp009  0.047620   0.010042   4.742 2.13e-06 ***
participantvp010 -0.040968   0.008697  -4.711 2.48e-06 ***
participantvp020 -0.015875   0.008679  -1.829 0.067392 .  
participantvp023  0.029093   0.008689   3.348 0.000814 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                    edf Ref.df     F p-value    
s(angle):factor(participant)vp004 8.819  8.993  6353  <2e-16 ***
s(angle):factor(participant)vp005 8.952  8.999 17635  <2e-16 ***
s(angle):factor(participant)vp006 8.978  9.000 20458  <2e-16 ***
s(angle):factor(participant)vp008 8.984  9.000 27336  <2e-16 ***
s(angle):factor(participant)vp009 8.892  8.997  4242  <2e-16 ***
s(angle):factor(participant)vp010 8.937  8.999 21728  <2e-16 ***
s(angle):factor(participant)vp020 8.956  9.000 19925  <2e-16 ***
s(angle):factor(participant)vp023 8.954  9.000 16323  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.965   Deviance explained = 83.6%
fREML =  32728  Scale est. = 1         n = 23870
Code
gam.check(GAM_proc)


Method: fREML   Optimizer: perf chol
$grad
[1]  2.615685e-12  2.643219e-12  2.470824e-11  3.197442e-12  8.557599e-13
[6] -3.685940e-14  2.181988e-11 -4.130030e-14

$hess
              [,1]          [,2]          [,3]          [,4]          [,5]
[1,]  3.944586e+00  3.722687e-07 -3.654450e-08 -1.096777e-10 -9.060599e-14
[2,]  3.722687e-07  3.985320e+00 -3.747623e-08 -1.124183e-10 -9.287095e-14
[3,] -3.654450e-08 -3.747623e-08  3.965248e+00 -2.043257e-05 -1.687975e-08
[4,] -1.096777e-10 -1.124183e-10 -2.043257e-05  3.971195e+00 -5.758790e-06
[5,] -9.060556e-14 -9.287095e-14 -1.687975e-08 -5.758790e-06  3.974144e+00
[6,] -1.615671e-15 -1.656177e-15 -3.010184e-10 -1.026960e-07 -8.527767e-06
[7,] -7.396208e-18 -5.598739e-18 -1.017700e-12 -3.472006e-10 -2.883040e-08
[8,] -1.196879e-19 -3.617475e-22 -8.800091e-17 -3.002255e-14 -2.492974e-12
              [,6]          [,7]          [,8]
[1,] -1.613918e-15 -7.645135e-18  9.681514e-20
[2,] -1.656178e-15 -5.598654e-18 -5.129787e-22
[3,] -3.010184e-10 -1.017700e-12 -8.800089e-17
[4,] -1.026960e-07 -3.472006e-10 -3.002255e-14
[5,] -8.527767e-06 -2.883040e-08 -2.492974e-12
[6,]  3.943557e+00 -1.957919e-05 -1.693015e-09
[7,] -1.957919e-05  3.956929e+00 -7.442742e-07
[8,] -1.693015e-09 -7.442742e-07  3.965409e+00

Model rank =  80 / 80 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

                                    k'  edf k-index p-value
s(angle):factor(participant)vp004 9.00 8.82    0.99    0.18
s(angle):factor(participant)vp005 9.00 8.95    0.99    0.21
s(angle):factor(participant)vp006 9.00 8.98    0.99    0.15
s(angle):factor(participant)vp008 9.00 8.98    0.99    0.21
s(angle):factor(participant)vp009 9.00 8.89    0.99    0.21
s(angle):factor(participant)vp010 9.00 8.94    0.99    0.20
s(angle):factor(participant)vp020 9.00 8.96    0.99    0.16
s(angle):factor(participant)vp023 9.00 8.95    0.99    0.20
Code
acf_resid(GAM_proc)

Get the predictions and plot:

Code
pred <- get_predictions(GAM_proc,
                        cond = list(angle = angle_grid_plot, 
                                    participant = factor(rhotic_data_static_polar$participant) %>% levels()),
                        print.summary = FALSE)

gam_polar_proc_plot <- ggplot(pred) +
  aes(x = angle, y = fit, group = participant) +
  geom_line(mapping = aes(color = participant)) +
  geom_ribbon(mapping = aes(ymin = fit - CI, ymax = fit + CI, fill = participant), alpha = 0.5) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
gam_polar_proc_plot
Figure 8: GAM predicted contours (Procrustean normalization)

The tongue dorsum and tongue root in this plot resemble those in the initial plot more closely than those in the one based on Linear normalization. Nonetheless, it does not have the weird values around the tongue root that we had in the first GAM, which is a positive improvement. As in the first GAM, the values between tongue root and tongue dorsum are very close together. Regarding the tongue dorsum, we observe participant-specific variation: participant 8 once more shows a flat contour while most of the others show a more raised tongue dorsum. The front part (tongue tip) looks very similar to the linear normalization.

1.1.2 Cartesian coordinates

We have the option to use

  1. univariate GAMs - GAMs with only one dependent variable (Gubian 2025)

  2. multivariate GAMs - GAMs with more than one dependent variable (Coretta and Sakr 2025)

For the Cartesian tongue contour data, we focus on modeling the x and y coordinates as outcome variables. By definition, these are two variables that we try to predict. So it would make sense to use a multivariate GAM. However, it is also possible to model the data using univariate GAMs. The advantage of this approach is its lower complexity. On the downside, it fails to account for the correlation between the x and y coordinates, which may limit the model’s accuracy. We therefore apply both univariate ad multivariate GAMs to assess their performance on our data.

1.1.2.1 Univariate trick from Michele Gubian

We will encode the dimension as a factor with two levels (x & y). The dimension will be seen as a predictor variable. Since the participant variable is a predictor variable as well, we will create a compound factor out of these two to have only one predictor. The response variable will be the tongue position.

Code
# create compound factor participant.dim for all combinations of participant and dimension
rhotic_data_static_long <- rhotic_data_static %>% 
             pivot_longer(c(x_mean_z, y_mean_z), names_to = 'dimension', values_to = 'position') %>% 
             unite("participant.dim", c(participant, dimension), sep = ".") %>% 
             mutate(participant.dim = factor(participant.dim)) %>% 
             # order to capture AR along knots
             arrange(frame_id, participant.dim, knot)

# calculate GAM
GAM <- bam(position ~ participant.dim + s(knot, by = participant.dim),
           data = rhotic_data_static_long,
           family = 'scat',
           discrete = TRUE)

# autocorellation:
AR1.rho <- acf_resid(GAM)[2]

Model that takes autocorrelation in the residuals into account:

Code
GAM <- bam(position ~ participant.dim + s(knot, by = participant.dim),
           data = rhotic_data_static_long,
           rho = AR1.rho, # error model for residuals
            AR.start = rhotic_data_static_long %>% pull(knot) == 0,
           family = 'scat',
           discrete = TRUE)
summary(GAM)

Family: Scaled t(5.925,0.105) 
Link function: identity 

Formula:
position ~ participant.dim + s(knot, by = participant.dim)

Parametric coefficients:
                                Estimate Std. Error t value Pr(>|t|)    
(Intercept)                   -0.0018907  0.0070307  -0.269 0.787998    
participant.dimvp004.y_mean_z -0.0257281  0.0099429  -2.588 0.009668 ** 
participant.dimvp005.x_mean_z  0.0068721  0.0085923   0.800 0.423836    
participant.dimvp005.y_mean_z -0.0159515  0.0085923  -1.856 0.063391 .  
participant.dimvp006.x_mean_z -0.0053684  0.0085787  -0.626 0.531458    
participant.dimvp006.y_mean_z -0.0298260  0.0085787  -3.477 0.000508 ***
participant.dimvp008.x_mean_z  0.0072284  0.0086692   0.834 0.404396    
participant.dimvp008.y_mean_z -0.0077004  0.0086692  -0.888 0.374414    
participant.dimvp009.x_mean_z -0.0002993  0.0099108  -0.030 0.975907    
participant.dimvp009.y_mean_z -0.0338593  0.0099108  -3.416 0.000635 ***
participant.dimvp010.x_mean_z -0.0046124  0.0085787  -0.538 0.590816    
participant.dimvp010.y_mean_z -0.0169423  0.0085787  -1.975 0.048283 *  
participant.dimvp020.x_mean_z  0.0036264  0.0085609   0.424 0.671861    
participant.dimvp020.y_mean_z -0.0161501  0.0085609  -1.886 0.059235 .  
participant.dimvp023.x_mean_z -0.0052550  0.0085698  -0.613 0.539742    
participant.dimvp023.y_mean_z -0.0124969  0.0085698  -1.458 0.144778    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                        edf Ref.df     F p-value    
s(knot):participant.dimvp004.x_mean_z 8.957  9.000  6095  <2e-16 ***
s(knot):participant.dimvp004.y_mean_z 8.967  9.000 10842  <2e-16 ***
s(knot):participant.dimvp005.x_mean_z 8.986  9.000 13068  <2e-16 ***
s(knot):participant.dimvp005.y_mean_z 8.990  9.000 23979  <2e-16 ***
s(knot):participant.dimvp006.x_mean_z 8.989  9.000 12994  <2e-16 ***
s(knot):participant.dimvp006.y_mean_z 8.988  9.000 21308  <2e-16 ***
s(knot):participant.dimvp008.x_mean_z 8.985  9.000 12878  <2e-16 ***
s(knot):participant.dimvp008.y_mean_z 8.962  9.000 15641  <2e-16 ***
s(knot):participant.dimvp009.x_mean_z 8.945  8.999  6118  <2e-16 ***
s(knot):participant.dimvp009.y_mean_z 8.981  9.000 13314  <2e-16 ***
s(knot):participant.dimvp010.x_mean_z 8.984  9.000 12879  <2e-16 ***
s(knot):participant.dimvp010.y_mean_z 8.988  9.000 19914  <2e-16 ***
s(knot):participant.dimvp020.x_mean_z 8.989  9.000 13593  <2e-16 ***
s(knot):participant.dimvp020.y_mean_z 8.991  9.000 22987  <2e-16 ***
s(knot):participant.dimvp023.x_mean_z 8.986  9.000 12835  <2e-16 ***
s(knot):participant.dimvp023.y_mean_z 8.993  9.000 23514  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.983   Deviance explained = 91.1%
fREML =  51116  Scale est. = 1         n = 47740

Check the GAM and the distribution of residuals:

Code
gam.check(GAM)


Method: fREML   Optimizer: perf chol
$grad
 [1]  6.661338e-15  3.406608e-12 -1.776357e-15  1.643130e-14  9.281464e-14
 [6]  4.978240e-13 -3.552714e-15 -4.707346e-14 -2.451372e-13  2.793321e-13
[11]  4.440892e-16 -1.509903e-14 -3.019807e-14  6.794565e-14  2.042810e-14
[16]  4.884981e-15

$hess
               [,1]          [,2]          [,3]          [,4]          [,5]
 [1,]  3.986825e+00 -3.379901e-20 -1.088831e-20 -2.298015e-21 -8.130072e-21
 [2,] -4.287774e-21  3.993054e+00  6.418114e-36  3.209057e-36  1.825693e-36
 [3,] -4.814880e-35  3.122366e-35  3.995869e+00  9.590812e-36  5.554723e-36
 [4,] -3.034181e-21  2.347843e-35  9.372255e-36  3.997995e+00  5.136816e-36
 [5,]  2.256949e-36 -4.606377e-36 -6.954616e-36 -5.890601e-36  3.995908e+00
 [6,] -1.273799e-21  3.411716e-35  1.228949e-35  5.261020e-36  6.037559e-36
 [7,] -3.761586e-36  1.313533e-36  8.756884e-37  4.378442e-37  2.189221e-37
 [8,]  1.390333e-20  4.376036e-35  3.661683e-35  1.118682e-35  7.847102e-36
 [9,] -2.573967e-21  4.574581e-35  2.053830e-35  1.398684e-35  6.993418e-36
[10,] -1.274682e-21  1.542622e-35  1.028414e-35  5.795083e-36  2.549301e-36
[11,]  1.730326e-35 -6.557369e-36 -9.872541e-36 -9.117839e-36 -1.536485e-36
[12,]  1.805548e-35  2.159102e-35  6.068294e-36  1.801901e-36  3.567111e-36
[13,]  6.965614e-22 -2.586485e-35 -2.454147e-35 -7.312281e-36 -4.223611e-36
[14,]  1.203683e-35 -6.105459e-35 -4.070306e-35 -1.710099e-35 -1.048801e-35
[15,]  4.561904e-21 -1.018138e-34 -7.525338e-35 -3.759463e-35 -1.792666e-35
[16,] -6.612254e-21  2.150485e-35  1.932475e-35  7.168283e-36  3.468520e-36
               [,6]          [,7]          [,8]          [,9]         [,10]
 [1,]  1.479583e-21 -1.208655e-20 -5.398780e-21 -7.153072e-21  6.675099e-21
 [2,]  3.209057e-36  3.209057e-36  6.418114e-36  6.418114e-36 -1.014972e-36
 [3,]  1.110945e-35  1.110945e-35  2.204080e-35  2.221889e-35 -3.503987e-37
 [4,]  6.781823e-36  7.978310e-36  2.108213e-35  2.101405e-35  5.312605e-37
 [5,] -9.037291e-37 -1.388614e-36 -2.401885e-36 -2.777227e-36  1.351818e-36
 [6,]  3.996898e+00  1.088833e-35  1.837702e-35  4.377396e-36 -4.297471e-37
 [7,]  3.020650e-36  3.994486e+00  8.756884e-37  3.427459e-36 -3.340956e-52
 [8,]  1.515839e-35  1.569480e-35  3.986204e+00  4.825608e-35  7.073211e-36
 [9,]  6.150819e-36  1.480516e-35  4.168266e-35  3.974705e+00 -1.978981e-36
[10,]  4.910318e-36  5.142072e-36  1.018952e-35  1.201797e-35  3.996003e+00
[11,] -2.411979e-36 -1.969911e-36 -1.661666e-36 -3.939821e-36  2.241309e-36
[12,]  1.527412e-35  6.657693e-36  9.361005e-36 -7.121437e-36 -8.753178e-37
[13,] -1.095342e-36 -8.104338e-36 -1.620868e-35 -1.620868e-35  3.058132e-37
[14,] -2.035153e-35 -2.035153e-35 -4.070306e-35 -4.070306e-35  7.531329e-36
[15,] -4.112440e-35 -3.776053e-35 -7.518926e-35 -7.788362e-35  1.323588e-38
[16,]  7.696367e-36  7.021421e-36  2.211999e-35  1.356753e-35 -2.738538e-37
              [,11]         [,12]         [,13]         [,14]         [,15]
 [1,] -2.557208e-21  8.642578e-21  2.134414e-21 -1.109153e-21  1.816855e-21
 [2,]  5.104146e-38 -2.004574e-51 -3.209057e-36 -6.418114e-36 -1.388816e-35
 [3,] -2.672765e-50  2.798828e-37 -1.110945e-35 -2.221889e-35 -2.423407e-35
 [4,] -4.677338e-51  2.528718e-36 -7.978310e-36 -1.595662e-35 -3.574514e-35
 [5,] -6.978384e-37  1.538931e-37  6.058781e-36  4.890920e-36  2.937709e-35
 [6,]  1.870196e-37 -1.839184e-36 -5.474566e-36 -1.707477e-35  9.505194e-37
 [7,] -6.473102e-52 -2.076931e-36 -4.378442e-37 -8.756884e-37 -1.751377e-36
 [8,]  1.907488e-36  8.944693e-37  1.499702e-35 -5.278211e-35 -1.203914e-34
 [9,] -1.395976e-38  3.554119e-36 -1.651971e-35 -1.818382e-35 -1.035126e-34
[10,] -4.009147e-51  6.627017e-37 -5.142072e-36 -1.028414e-35 -1.504002e-35
[11,]  3.992523e+00  4.993674e-39  8.632606e-36  4.312484e-36  3.842554e-35
[12,]  3.312395e-37  3.993851e+00 -2.057118e-36 -7.690806e-36  1.601730e-35
[13,] -9.421976e-37  1.661722e-36  3.996164e+00  8.174920e-36  3.241735e-35
[14,]  7.602346e-37 -2.138212e-50  1.857912e-35  3.995916e+00  8.140612e-35
[15,] -7.440113e-37  4.760053e-36  4.213358e-35  7.280188e-35  3.992971e+00
[16,]  6.180885e-37  2.425053e-37 -4.818490e-36 -7.193152e-36 -3.026539e-35
              [,16]
 [1,] -3.349538e-21
 [2,]  3.209057e-36
 [3,]  1.110945e-35
 [4,]  2.230194e-36
 [5,] -2.195273e-36
 [6,]  1.058796e-35
 [7,]  4.378442e-37
 [8,]  1.653460e-35
 [9,]  1.398684e-35
[10,]  6.168843e-37
[11,] -2.103109e-36
[12,]  5.673835e-36
[13,] -7.159778e-36
[14,] -1.837143e-35
[15,] -4.243767e-35
[16,]  3.995738e+00

Model rank =  160 / 160 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

                                        k'  edf k-index p-value   
s(knot):participant.dimvp004.x_mean_z 9.00 8.96    0.97   0.035 * 
s(knot):participant.dimvp004.y_mean_z 9.00 8.97    0.97   0.015 * 
s(knot):participant.dimvp005.x_mean_z 9.00 8.99    0.97   0.045 * 
s(knot):participant.dimvp005.y_mean_z 9.00 8.99    0.97   0.015 * 
s(knot):participant.dimvp006.x_mean_z 9.00 8.99    0.97   0.010 **
s(knot):participant.dimvp006.y_mean_z 9.00 8.99    0.97   0.020 * 
s(knot):participant.dimvp008.x_mean_z 9.00 8.99    0.97   0.015 * 
s(knot):participant.dimvp008.y_mean_z 9.00 8.96    0.97   0.040 * 
s(knot):participant.dimvp009.x_mean_z 9.00 8.95    0.97   0.020 * 
s(knot):participant.dimvp009.y_mean_z 9.00 8.98    0.97   0.020 * 
s(knot):participant.dimvp010.x_mean_z 9.00 8.98    0.97   0.020 * 
s(knot):participant.dimvp010.y_mean_z 9.00 8.99    0.97   0.015 * 
s(knot):participant.dimvp020.x_mean_z 9.00 8.99    0.97   0.030 * 
s(knot):participant.dimvp020.y_mean_z 9.00 8.99    0.97   0.035 * 
s(knot):participant.dimvp023.x_mean_z 9.00 8.99    0.97   0.010 **
s(knot):participant.dimvp023.y_mean_z 9.00 8.99    0.97   0.015 * 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
acf_resid(GAM)

Code
gam_pred <- get_predictions(GAM,
                        cond = list(knot = 0:10,
                                    participant.dim = rhotic_data_static_long$participant.dim %>% levels()),
                        se = FALSE,
                        print.summary = FALSE) %>% 
  separate_wider_delim(participant.dim, ".", names = c("participant", "dimension")) %>% 
  pivot_wider(names_from = dimension, values_from = "fit")

gam_pred_plot <- ggplot(gam_pred) +
  aes(x_mean_z, y_mean_z, group = participant, color = participant) +
  geom_path() +
  geom_point(aes(color= participant)) +
  geom_text(aes(label = knot), check_overlap = TRUE,
            nudge_x = .05, nudge_y = .05, show.legend = FALSE) +
  theme(legend.position = "bottom")
Code
gam_pred_plot
Figure 9: GAM predicted contours

Or with a greater resolution:

Code
gam_pred_res <- get_predictions(GAM,
                        cond = list(knot = seq(0, 10, by = 0.1),
                                    participant.dim = rhotic_data_static_long$participant.dim %>% levels()),
                        se = FALSE,
                        print.summary = FALSE) %>% 
  separate_wider_delim(participant.dim, ".", names = c("participant", "dimension")) %>% 
  pivot_wider(names_from = dimension, values_from = "fit")

gam_pred_plot_res <- ggplot(gam_pred_res) +
  aes(x_mean_z, y_mean_z, group = participant, color = participant) +
  geom_path() +
  theme(legend.position = "bottom")
Code
gam_pred_plot_res
Figure 10: GAM predicted contours (predicted 100 “knots” instead of 10

We can observe the same general tendencies as in the Polar GAM prediction. Participant 8 shows the most raised tongue tip, participant 9 the most bunched tongue tip. The remaining participants fall between these two extremes. Participants 5, 6, 9 and 20 have a similarly high tongue dorsum, while the other participants’ tongue dorsums are slightly flatter (esp. participant 8’s tongue dorsum is clearly flatter). Between tongue dorsum and tongue root (knot 2 and 3), the participants are close together – at the tongue root (knot 0 and 1), the participants are spreading further apart. This was also the case in the Procrustean normalized Polar GAM – but not for the linear normalization. There, the GAM rather showed a wide spread at the transition from tongue dorsum to tongue root while the tongue roots of the participants were closer together.

We could now calculate difference plots between all of the participants. However, with 8 participants, this would result in 56 pairwise comparisons, which is quite extensive. Therefore, we will begin by generating a single difference plot to examine its structure. Since the largest difference appears to be between participant 8 and 9, we will visualize their differences in both the x and the y dimension.

Code
plot_diff(GAM, view = "knot", comp = list(participant.dim = c("vp009.x_mean_z", "vp008.x_mean_z")),
            print.summary = FALSE, main = "participant 9 - participant 8, Dimension x", ylab = "x: 9 - 8")
Figure 11: GAM difference plots: participant 8 vs participant 9
Code
plot_diff(GAM, view = "knot", comp = list(participant.dim = c("vp009.y_mean_z", "vp008.y_mean_z")),
            print.summary = FALSE, main = "participant 9 - participant 8, Dimension y", ylab = "y: 9 - 8")

It is important to note that these difference plots only compare the differences at the specific knots. For instance, it reveals a significant difference in knot 3 of participant 8 and 9 in both dimensions. It would not reveal if instead of knot 3 of both participants, e.g., knot 3 of participant 8 and knot 2 of participant 9 are very close to each other. This is indeed the case, as can be seen in the following plot:

Code
ex_diff_knots_close <- gam_pred %>% filter(participant %in% c("vp008", "vp009")) %>% 
  ggplot() +
  aes(x_mean_z, y_mean_z, group = participant, color = participant) +
  geom_path() +
  geom_point(aes(color= participant)) +
  geom_text(aes(label = knot), check_overlap = TRUE,
            nudge_x = .05, nudge_y = .05, show.legend = FALSE) +
  theme(legend.position = "bottom")
Code
ex_diff_knots_close
Figure 12: Example of two different knots (2 and 3) of two participants (8 and 9) being close to each other

1.1.2.2 Multivariate GAM from Stefano Coretta’s mv_uti

Instead of using the univariate GAM function, we will now follow the mv_uti tutorial by Stefano Coretta and calculate a multivariate (or – in this case – a bivariate) GAM. In this case the x and y variables will be modeled as “two variables changing along knot number” (Coretta and Sakr 2025).

This MGAM consists of:

  • a list in which we define two model formulas – one for x, one for y
  • mvn(d = 2) a multivariate normal family with two dimensions (x and y)
  • a smooth over knot for both dependent variables s(knot, …) with a by variable that is in this case participant
  • a k that is set to 5, because “X/Y coordinates of tongue contours […] are by nature not very ‘wiggly’” (Coretta and Sakr 2025).

A factor smooth over participant, as shown in Coretta (2025), would make no sense here because participant is our independet variable.

Code
mv_gam <- mgcv::gam(list(
    x_mean_z ~ factor(participant) +
      s(knot, by = factor(participant), k = 5),
    y_mean_z ~ factor(participant) +
      s(knot, by = factor(participant), k = 5)
  ),
  data = rhotic_data_static,
  family = mvn(d = 2) # multivariate normal family, two dimensions
)

GAM summary:

Code
summary(mv_gam)

Family: Multivariate normal 
Link function: 

Formula:
x_mean_z ~ factor(participant) + s(knot, by = factor(participant), 
    k = 5)
y_mean_z ~ factor(participant) + s(knot, by = factor(participant), 
    k = 5)

Parametric coefficients:
                             Estimate Std. Error z value Pr(>|z|)
(Intercept)                 1.603e-16  2.853e-03       0        1
factor(participant)vp005   -1.545e-16  3.487e-03       0        1
factor(participant)vp006   -1.420e-16  3.481e-03       0        1
factor(participant)vp008   -1.390e-16  3.518e-03       0        1
factor(participant)vp009   -6.511e-17  4.022e-03       0        1
factor(participant)vp010   -9.632e-17  3.481e-03       0        1
factor(participant)vp020   -7.809e-17  3.474e-03       0        1
factor(participant)vp023   -1.248e-16  3.478e-03       0        1
(Intercept).1              -2.220e-18  3.604e-03       0        1
factor(participant)vp005.1  5.681e-17  4.405e-03       0        1
factor(participant)vp006.1  8.390e-17  4.398e-03       0        1
factor(participant)vp008.1  2.883e-17  4.444e-03       0        1
factor(participant)vp009.1  7.621e-17  5.081e-03       0        1
factor(participant)vp010.1  8.032e-17  4.398e-03       0        1
factor(participant)vp020.1  1.301e-16  4.389e-03       0        1
factor(participant)vp023.1  1.152e-16  4.393e-03       0        1

Approximate significance of smooth terms:
                                     edf Ref.df Chi.sq p-value    
s(knot):factor(participant)vp004   3.998      4 121296  <2e-16 ***
s(knot):factor(participant)vp005   3.999      4 245397  <2e-16 ***
s(knot):factor(participant)vp006   3.999      4 247914  <2e-16 ***
s(knot):factor(participant)vp008   3.999      4 231143  <2e-16 ***
s(knot):factor(participant)vp009   3.996      4 123435  <2e-16 ***
s(knot):factor(participant)vp010   3.999      4 247690  <2e-16 ***
s(knot):factor(participant)vp020   3.999      4 250653  <2e-16 ***
s(knot):factor(participant)vp023   3.999      4 249810  <2e-16 ***
s.1(knot):factor(participant)vp004 3.997      4  74351  <2e-16 ***
s.1(knot):factor(participant)vp005 3.999      4 152936  <2e-16 ***
s.1(knot):factor(participant)vp006 3.999      4 154482  <2e-16 ***
s.1(knot):factor(participant)vp008 3.996      4 145582  <2e-16 ***
s.1(knot):factor(participant)vp009 3.998      4  75331  <2e-16 ***
s.1(knot):factor(participant)vp010 3.999      4 154141  <2e-16 ***
s.1(knot):factor(participant)vp020 3.999      4 155728  <2e-16 ***
s.1(knot):factor(participant)vp023 3.999      4 154722  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Deviance explained = 98.1%
-REML = -73932  Scale est. = 1         n = 23870

We now want to plot the predicted tongue contours (step by step).

We need a grid to base our predictions on:

Code
# Create a grid of values to predict for
frame_mv_gam <- expand_grid(
  # All the speakers
  participant = unique(rhotic_data_static$participant),
  # Knots from 0 to 10 by increments of 0.1
  # This gives us greater resolution along the tongue contour than just using 10 knots
  knot = seq(0, 10, by = 0.1)
)

Now we extract predictions with predict():

Code
mv_gam_predict <- predict(mv_gam, frame_mv_gam, se.fit = TRUE) |>
  as.data.frame() |>
  as_tibble()

# Rename columns
colnames(mv_gam_predict) <- c("x_mean_z", "y_mean_z", "x_se", "y_se")

# combine both data frames
mv_gam_predict <- bind_cols(frame_mv_gam, mv_gam_predict) %>% 
  mutate(
    x_lo = x_mean_z - (1.96 * x_se),
    x_hi = x_mean_z + (1.96 * x_se),
    y_lo = y_mean_z - (1.96 * y_se),
    y_hi = y_mean_z + (1.96 * y_se)
  )

Now we plot:

Code
mv_gam_prediction_plot <- mv_gam_predict %>% 
  ggplot(aes(x_mean_z, y_mean_z, colour = participant)) +
  geom_point(size = 1, alpha = 0.75) +
  coord_fixed() +
  theme(legend.position = "bottom")
Code
mv_gam_prediction_plot
Figure 13: Multivariate GAM predicted contours: dots
Code
mv_gam_prediction_plot2 <- mv_gam_predict %>% 
  ggplot(aes(group = participant, color = participant)) +
  geom_path(aes(x = x_mean_z, y = y_mean_z)) +
  geom_path(aes(x = x_lo, y = y_lo), linetype = "dotted") +
  geom_path(aes(x = x_hi, y = y_hi), linetype = "dotted") +
  coord_fixed() +
  theme(legend.position = "bottom")
Code
mv_gam_prediction_plot2
Figure 14: Multivariate GAM predicted contours: lines

In sum, this looks similar to the predictions of the univariate GAM shown in Figure 9.

1.2 (M)FPCA

MFPCA stands for Multivariate Functional Principal Component Analysis. For a general tutorial see Gubian et al (2015).

1.2.1 Polar coordinates

The FPCA calculation using the Polar coordinates is based on Gubian (2025). We will first calculate an FPCA model. In that model we will be able to see the overall variation of our data. Then we will test if there’s a difference between the participants by calculating a linear model using the PC scores.

We will build a funData object to calculate a FPCA. For this, we will use the Procrustean normalized data from Section 1.1.1.2.

We use the long2irregFunData function from the landmarkregUtils package to convert our data to the correct data object. In the FPCA, the angle serves as a predictor variable for the dependent variable radius.

Code
# build a funData object
fpca_curves_static_polar_proc <- long2irregFunData(proc_static_polar_curves, 
                                  id = "frame_id",
                                  time = "angle", # predictor variable
                                  value = "radius") %>% # dependent variable
  as.funData()
fpca_curves_static_polar_proc
Functional data with 2170 observations of 1-dimensional support
argvals:
    1.136487 1.267944 ... 2.451058      (11 sampling points)
X:
    array of size 2170 x 11 

Then we can calculate the FPCA using the PACE function:

Code
# Compute FPCA
fpca_polar_static_proc <- PACE(fpca_curves_static_polar_proc)

# Prop of explained var
PropExpVar <- round(fpca_polar_static_proc$values  / sum(fpca_polar_static_proc$values), digits = 3)
Code
PropExpVar
[1] 0.492 0.247 0.103 0.090 0.038 0.031

The PACE function automatically chooses how many Principle Components (PCs) to compute. Here, it has computed 6 PCs. The 6 PCs cover 99 % of the variance in the data (Gubian 2025).

The first PC explains 49 % of the variance in the data, the second PC 24 % of the variance and the third PC 10 %. We will now take a closer look at these three PCs.

Code
nPC <- 3
# scores st. dev.
sdFun <- fpca_polar_static_proc$values %>% sqrt()

# PC curves to be plotted
PCcurves_polar_static_proc <- expand_grid(PC = 1:nPC,
                        fractionOfStDev = seq(-1, 1, by=.25)) %>%
  group_by(PC, fractionOfStDev) %>%
  reframe(
    funData2long1(fpca_polar_static_proc$mu + fractionOfStDev * sdFun[PC] * fpca_polar_static_proc$functions[PC],
                  time = "angle", value = "radius")
  )
# Plot
FPCA_polar_static_proc_plot <- ggplot(PCcurves_polar_static_proc) +
  aes(x = angle, y = radius, group = fractionOfStDev, color = fractionOfStDev) +
  geom_line() +
  scale_color_gradient2(low = "blue", mid = "grey", high = "orangered",
                        breaks = c(-1, 0 , 1)) +
  facet_wrap(~ PC, ncol = 2,
             labeller = labeller(PC = ~ str_glue("PC{.x}"))) +
  labs(color = expression(frac(s[k], sigma[k]))) +
  geom_line(data = PCcurves_polar_static_proc %>% filter(fractionOfStDev == 0), color = 'black', linewidth = 1.2) +
  radial_plot(4) +
  theme(legend.position = "bottom")
Code
FPCA_polar_static_proc_plot
Figure 15: FPCA: 3 Principle Components for Polar static data

The 3 PCs describe different shape variations in the data (regardless of the variable participant). Pictured in black is the predicted mean tongue contour. The colored lines show the effect of the PCs – the blue lines show a deviation from the mean up to -1 standard deviation, in orange up to +1 standard deviation of the respective PC. The different PCs are pictured independently – PC1 in the top left panel, PC2 in the top right panel, PC3 in the bottom panel.

The first PC score describes a variation of the tongue contour mainly in the front part of the tongue. A larger PC1 score (pictured in orange) describes a more downward pointed tongue tip, a smaller PC1 score (pictured in blue) describes a more raised tongue tip. Additionally, the PC1 score describes a general retraction / fronting of the tongue: A larger PC1 score causes the whole contour to be slightly retracted, a smaller PC1 score causes the contour to be more fronted.

If we consider this in regard to the rhotic we are investigating, we could expect a smaller PC1 score describing alveolar/front rhotics (raised tongue tip, tongue moves forward) and a larger PC1 score describing uvular/back rhotics (bunched tongue, tongue tip is pointing downward, tongue dorsum is raised).

PC2 mainly describes a tongue movement along the vertical axis. A large PC2 score describes a general lowering of the tongue dorsum and a slight raising of the tongue tip, a smaller PC2 score describes a raising of the tongue dorsum and a slight lowering of the tongue tip. In general, a larger PC2 score describes a flat tongue contour, while a smaller PC2 score describes a steeper tongue with an arched tongue dorsum. We could apply this pattern to the rhotics as well – a smaller PC2 score may be an indication for a uvular/back rhotic, a larger PC2 score may indicate a front/alveolar rhotic – although we would probably expect a more raised tongue tip. PC2 could also describe variation within the uvular rhotics: a uvular fricative would be expected to look like a tongue contour with a low PC2 score (very high tongue dorsum that probably causes a friction), a uvular approximant would probably look more like a tongue contour with a higher PC2.

PC3 also describes a height variation: A low PC3 score describes a general raising of the whole tongue (tongue tip & tongue dorsum) – a high PC3 score describes a lowering of the whole tongue.

We will now plot the PC scores of the participants to see if they differ in their PC scores.

Code
# collect PC scores
PCscores_static_polar <- fpca_polar_static_proc$scores %>%
  `colnames<-`( paste0("s", 1:fpca_polar_static_proc$npc)) %>%
  as_tibble() %>%
  bind_cols(proc_static_polar_curves %>% distinct(frame_id, participant), .)

# boxplots PC scores by Category
FPCA_static_polar_boxplot <- PCscores_static_polar %>% 
  pivot_longer(cols = s1:all_of(str_glue("s{fpca_polar_static_proc$npc}")), 
               names_to = "PC", values_to = "score") %>% 
  filter(PC %in% str_c("s", 1:nPC)) %>% 
  ggplot() +
  aes(x = participant, y = score, color = participant) +
  geom_boxplot() +
  facet_wrap(~ PC) +
  theme(legend.position = "bottom")
Code
FPCA_static_polar_boxplot
Figure 16: FPCA: distribution of PC scores by participant

The boxplots show that the participants have a great variation in PC1 and PC2, but very similar PC3 scores. It seems like PC1 and PC2 are dependent on the variable participant while PC3 is independent. The variation pictured through the PC3 score could maybe be explained by another variable.

Another way to plot the PC1 and PC2 scores:

Code
# scatterplot PC scores s1 and s2 by participant
pc_mean <- PCscores_static_polar %>% 
  group_by(participant) %>% 
  summarise(
    mean_s1 = mean(s1),
    mean_s2 = mean(s2),
    mean_s3 = mean(s3)
  ) %>% 
  ungroup()

PCscores_static_polar_scatterplot <- ggplot(PCscores_static_polar) +
  aes(x = s1, y = s2, color = participant) +
  geom_point(alpha = 0.1) +
  geom_point(data = pc_mean, mapping = aes(x = mean_s1, y = mean_s2, color = participant),pch = 17, size = 4) +
  stat_ellipse() +
  theme(legend.position = "right")
Code
PCscores_static_polar_scatterplot
Figure 17: FPCA: distribution of PC scores by participant – scatterplot

There is considerable variation among participants in the PC1 scores (s1). While participant 8 has the lowest s1, participant 9 has the highest. The other participants fall somewhere in between but are grouped relatively close together. Participants 6 and 10 show scores below the mean (0), while participants 4, 5, 20 and 23 show scores above the mean. As shown in Figure 15, s1 describes a retraction/fronting and a lowering/raising of the tongue tip. Participant 8, along other participants exhibiting relatively low s1 values, is therefore expected to have a predicted tongue contour that is fronted and has a raised tongue tip – which may indicate alveolar/front rhotics. In contrast, participant 9 and those with relatively high s1 values are expected to show a retracted tongue contour and have a downward pointing tongue tip – which may suggest uvular/back rhotics.

For the PC2 score (s2), there seem to be two groups of participants: participants 5, 6 and 20 have a low s2, while the other participants (4, 8, 9, 10 and 23) have a rather high s2. While the two groups are not clearly separated, the mean scores reveal a distinct split – despite some overlap in the data points. Looking back at Figure 15, we know that s2 describes the “steepness” of the tongue shape. A low s2 (as seen in participants 5, 6 and 20) describes a higher tongue dorsum/steeper tongue shape and a high s2 (as seen in participants 4, 8, 9, 10 and 23) describes a flatter tongue shape.

Now we’re interested in calculating a linear model (lm) to see if the PC scores correlate with the variable participant (lm(PCscore ~ participant)) or if this is just random. Can we predict s1/s2 by participant?

Code
# s1
s1_mod <- lm(s1 ~ participant, data = PCscores_static_polar)
s1_mod %>% summary()

Call:
lm(formula = s1 ~ participant, data = PCscores_static_polar)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.48373 -0.06849  0.00802  0.07166  0.60342 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       0.066865   0.009405   7.109 1.58e-12 ***
participantvp005  0.026871   0.011494   2.338   0.0195 *  
participantvp006 -0.095997   0.011476  -8.365  < 2e-16 ***
participantvp008 -0.347785   0.011597 -29.988  < 2e-16 ***
participantvp009  0.183255   0.013258  13.822  < 2e-16 ***
participantvp010 -0.164758   0.011476 -14.357  < 2e-16 ***
participantvp020 -0.004366   0.011452  -0.381   0.7030    
participantvp023  0.010185   0.011464   0.888   0.3744    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1163 on 2162 degrees of freedom
Multiple R-squared:  0.5933,    Adjusted R-squared:  0.5919 
F-statistic: 450.5 on 7 and 2162 DF,  p-value: < 2.2e-16

Pairwise comparisons of participants using emmeans:

Code
emmeans(s1_mod, pairwise ~ participant)
$emmeans
 participant  emmean      SE   df lower.CL upper.CL
 vp004        0.0669 0.00941 2162   0.0484   0.0853
 vp005        0.0937 0.00661 2162   0.0808   0.1067
 vp006       -0.0291 0.00658 2162  -0.0420  -0.0162
 vp008       -0.2809 0.00679 2162  -0.2942  -0.2676
 vp009        0.2501 0.00934 2162   0.2318   0.2684
 vp010       -0.0979 0.00658 2162  -0.1108  -0.0850
 vp020        0.0625 0.00653 2162   0.0497   0.0753
 vp023        0.0770 0.00655 2162   0.0642   0.0899

Confidence level used: 0.95 

$contrasts
 contrast      estimate      SE   df t.ratio p.value
 vp004 - vp005 -0.02687 0.01150 2162  -2.338  0.2735
 vp004 - vp006  0.09600 0.01150 2162   8.365  <.0001
 vp004 - vp008  0.34778 0.01160 2162  29.988  <.0001
 vp004 - vp009 -0.18326 0.01330 2162 -13.822  <.0001
 vp004 - vp010  0.16476 0.01150 2162  14.357  <.0001
 vp004 - vp020  0.00437 0.01150 2162   0.381  0.9999
 vp004 - vp023 -0.01018 0.01150 2162  -0.888  0.9871
 vp005 - vp006  0.12287 0.00932 2162  13.180  <.0001
 vp005 - vp008  0.37466 0.00947 2162  39.559  <.0001
 vp005 - vp009 -0.15638 0.01140 2162 -13.664  <.0001
 vp005 - vp010  0.19163 0.00932 2162  20.556  <.0001
 vp005 - vp020  0.03124 0.00929 2162   3.362  0.0179
 vp005 - vp023  0.01669 0.00931 2162   1.793  0.6251
 vp006 - vp008  0.25179 0.00945 2162  26.648  <.0001
 vp006 - vp009 -0.27925 0.01140 2162 -24.439  <.0001
 vp006 - vp010  0.06876 0.00930 2162   7.394  <.0001
 vp006 - vp020 -0.09163 0.00927 2162  -9.884  <.0001
 vp006 - vp023 -0.10618 0.00928 2162 -11.436  <.0001
 vp008 - vp009 -0.53104 0.01150 2162 -45.985  <.0001
 vp008 - vp010 -0.18303 0.00945 2162 -19.371  <.0001
 vp008 - vp020 -0.34342 0.00942 2162 -36.457  <.0001
 vp008 - vp023 -0.35797 0.00943 2162 -37.944  <.0001
 vp009 - vp010  0.34801 0.01140 2162  30.457  <.0001
 vp009 - vp020  0.18762 0.01140 2162  16.454  <.0001
 vp009 - vp023  0.17307 0.01140 2162  15.163  <.0001
 vp010 - vp020 -0.16039 0.00927 2162 -17.302  <.0001
 vp010 - vp023 -0.17494 0.00928 2162 -18.842  <.0001
 vp020 - vp023 -0.01455 0.00926 2162  -1.572  0.7671

P value adjustment: tukey method for comparing a family of 8 estimates 

A small p-value means that, according to the linear model, the predicted tongue contours of the two compared participants are significantly different.

Now we should be able to reconstruct the tongue contours that are predicted by s1 for each participant:

Code
# reconstruct predicted curves
predCurves_s1 <- emmeans(s1_mod, pairwise ~ participant)$emmeans %>%
  as_tibble() %>%  
  group_by(participant) %>% 
  reframe(bind_cols(
    funData2long1(fpca_polar_static_proc$mu + emmean * fpca_polar_static_proc$functions[1], time = 'angle', value = "radius"),
    funData2long1(fpca_polar_static_proc$mu + lower.CL * fpca_polar_static_proc$functions[1], time = 'angle', value = "yl") %>% 
      select(yl),
    funData2long1(fpca_polar_static_proc$mu + upper.CL * fpca_polar_static_proc$functions[1], time = 'angle', value = "yu") %>% 
      select(yu)
  ))

FPCA_static_polar_predplot_s1 <- ggplot(predCurves_s1) +
  aes(angle, radius, color = participant) +
  geom_line() +
  geom_ribbon(aes(x = angle, ymin = yl, ymax = yu, fill = participant),
              alpha = 0.3, inherit.aes = FALSE) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
FPCA_static_polar_predplot_s1
Figure 18: FPCA: Predicted tongue contours by PC1 score

In this prediction plot, we can see the wide range of tongue contours that the lm predicted based on the participants’ PC1 scores. As expected, participant 9 has the most retracted tongue contour with the most downward pointing tongue tip while participant 8 has the most fronted tongue contour with a raised tongue tip.

We can do the same type of model for s2:

Code
# s2
s2_mod <- lm(s2 ~ participant, data = PCscores_static_polar)
s2_mod %>% summary()

Call:
lm(formula = s2 ~ participant, data = PCscores_static_polar)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.261095 -0.059586  0.003327  0.060401  0.259206 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       0.073559   0.007115  10.339  < 2e-16 ***
participantvp005 -0.152863   0.008695 -17.581  < 2e-16 ***
participantvp006 -0.204168   0.008681 -23.519  < 2e-16 ***
participantvp008  0.022647   0.008773   2.582  0.00990 ** 
participantvp009 -0.025948   0.010029  -2.587  0.00974 ** 
participantvp010 -0.011536   0.008681  -1.329  0.18404    
participantvp020 -0.175577   0.008663 -20.268  < 2e-16 ***
participantvp023  0.025178   0.008672   2.903  0.00373 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.088 on 2162 degrees of freedom
Multiple R-squared:  0.5284,    Adjusted R-squared:  0.5269 
F-statistic: 346.1 on 7 and 2162 DF,  p-value: < 2.2e-16

Pairwise comparisons of participants using emmeans:

Code
emmeans(s2_mod, pairwise ~ participant)
$emmeans
 participant  emmean      SE   df lower.CL upper.CL
 vp004        0.0736 0.00711 2162   0.0596   0.0875
 vp005       -0.0793 0.00500 2162  -0.0891  -0.0695
 vp006       -0.1306 0.00497 2162  -0.1404  -0.1209
 vp008        0.0962 0.00513 2162   0.0861   0.1063
 vp009        0.0476 0.00707 2162   0.0337   0.0615
 vp010        0.0620 0.00497 2162   0.0523   0.0718
 vp020       -0.1020 0.00494 2162  -0.1117  -0.0923
 vp023        0.0987 0.00496 2162   0.0890   0.1085

Confidence level used: 0.95 

$contrasts
 contrast      estimate      SE   df t.ratio p.value
 vp004 - vp005  0.15286 0.00869 2162  17.581  <.0001
 vp004 - vp006  0.20417 0.00868 2162  23.519  <.0001
 vp004 - vp008 -0.02265 0.00877 2162  -2.582  0.1631
 vp004 - vp009  0.02595 0.01000 2162   2.587  0.1610
 vp004 - vp010  0.01154 0.00868 2162   1.329  0.8880
 vp004 - vp020  0.17558 0.00866 2162  20.268  <.0001
 vp004 - vp023 -0.02518 0.00867 2162  -2.903  0.0724
 vp005 - vp006  0.05131 0.00705 2162   7.276  <.0001
 vp005 - vp008 -0.17551 0.00716 2162 -24.499  <.0001
 vp005 - vp009 -0.12691 0.00866 2162 -14.660  <.0001
 vp005 - vp010 -0.14133 0.00705 2162 -20.042  <.0001
 vp005 - vp020  0.02271 0.00703 2162   3.231  0.0274
 vp005 - vp023 -0.17804 0.00704 2162 -25.288  <.0001
 vp006 - vp008 -0.22681 0.00715 2162 -31.734  <.0001
 vp006 - vp009 -0.17822 0.00864 2162 -20.619  <.0001
 vp006 - vp010 -0.19263 0.00703 2162 -27.384  <.0001
 vp006 - vp020 -0.02859 0.00701 2162  -4.077  0.0012
 vp006 - vp023 -0.22935 0.00702 2162 -32.655  <.0001
 vp008 - vp009  0.04859 0.00874 2162   5.563  <.0001
 vp008 - vp010  0.03418 0.00715 2162   4.783  0.0001
 vp008 - vp020  0.19822 0.00713 2162  27.819  <.0001
 vp008 - vp023 -0.00253 0.00714 2162  -0.355  1.0000
 vp009 - vp010 -0.01441 0.00864 2162  -1.667  0.7084
 vp009 - vp020  0.14963 0.00863 2162  17.348  <.0001
 vp009 - vp023 -0.05113 0.00863 2162  -5.921  <.0001
 vp010 - vp020  0.16404 0.00701 2162  23.393  <.0001
 vp010 - vp023 -0.03671 0.00702 2162  -5.227  <.0001
 vp020 - vp023 -0.20075 0.00700 2162 -28.675  <.0001

P value adjustment: tukey method for comparing a family of 8 estimates 
Code
# reconstruct predicted curves
predCurves_s2 <- emmeans(s2_mod, pairwise ~ participant)$emmeans %>%
  as_tibble() %>%  
  group_by(participant) %>% 
  reframe(bind_cols(
    funData2long1(fpca_polar_static_proc$mu + emmean * fpca_polar_static_proc$functions[2], time = 'angle', value = "radius"),
    funData2long1(fpca_polar_static_proc$mu + lower.CL * fpca_polar_static_proc$functions[2], time = 'angle', value = "yl") %>% 
      select(yl),
    funData2long1(fpca_polar_static_proc$mu + upper.CL * fpca_polar_static_proc$functions[2], time = 'angle', value = "yu") %>% 
      select(yu)
  ))

FPCA_static_polar_predplot_s2 <- ggplot(predCurves_s2) +
  aes(angle, radius, color = participant) +
  geom_line() +
  geom_ribbon(aes(x = angle, ymin = yl, ymax = yu, fill = participant),
              alpha = 0.3, inherit.aes = FALSE) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
FPCA_static_polar_predplot_s2
Figure 19: FPCA: Predicted tongue contours by PC2 score

1.2.2 Cartesian coordinates

We aim to predict the tongue contours based on the 11 knots. To achieve more fine-grained predictions and plots, we add artificial “extra knots” between the original knots (Coretta and Sakr 2025). The first step is to create a grid for these predictions.

Code
sampling_period <- 0.1
maxknot <- max(rhotic_data_static$knot)
grid <- seq(0, maxknot, by = sampling_period)

We now need to make the data longer and create two separate columns: One column called “dim” that tells us the dimension of the value (either x-axis or y-axis) and one column called “value” with the position of the sensor on the respective axis/dimension.

Code
curves_xy <- rhotic_data_static %>% 
  pivot_longer(cols = c(x_mean_z, y_mean_z), names_to = "dim", values_to = "value") %>% 
  group_by(filename, frame_id, participant, dim) %>% 
  reframe(approx(knot, value, grid, rule = 2) %>% as_tibble()) %>% 
  ungroup() %>% rename(knot = x, value = y)

The next step is to build a multiFunData object (from the funData package). These are a class for multivariate functional data. We use the long2irregFunData function from the landmarkregUtils package to convert our data to the correct data object.

Code
# build a multiFunData object
curvesFun2D <- lapply(c("x_mean_z", "y_mean_z"), function(y)
  long2irregFunData(curves_xy %>% filter(dim == {{y}}),
                    id = "frame_id",
                    time = "knot",
                    value = "value") %>% 
    as.funData()
) %>% 
  multiFunData()

Now we have the curvesFun2D object that we need to compute the FPCA.

We will now compute 5 Principle Components – which is more than necessary for our purposes. This is “a quick way to get a good approx for the % explained var per PC” (Gubian 2025).

Code
# Compute FPCA
nPC <- 5
mfpca_cartesian_static <- MFPCA(curvesFun2D,
               M = nPC,
               uniExpansions = list(list(type = "uFPCA"),list(type = "uFPCA"))
)

# Prop of explained var
PropExpVar <- round(mfpca_cartesian_static$values / sum( mfpca_cartesian_static$values) , digits = 3)
# scores st dev
sdFun <- mfpca_cartesian_static$values %>% sqrt()

PropExpVar
[1] 0.538 0.263 0.099 0.062 0.037

The 1st PC explains about 54 % of the variance of the data, the 2nd PC explains 26 % and the 3rd still explains at least 10 % of the variance.

Going further, we will use the first two PCs and plot the according curves.

Code
nPC <- 2
# PC curves to be plotted
PCcurves_xy <- expand_grid(PC = 1:nPC,
                        dim = 1:2, 
                        fractionOfStDev = seq(-1, 1, by=.25)) %>%
  group_by(PC, dim, fractionOfStDev) %>%
  reframe(
    funData2long1(
      mfpca_cartesian_static$meanFunction[[dim]] +
        fractionOfStDev * sdFun[PC] * mfpca_cartesian_static$functions[[dim]][PC],
      time= "knot", value = "y")
  ) %>% 
  mutate(dim = factor(dim, levels = c(1,2), labels = c('x_mean_z', 'y_mean_z')))

# Plot
FPCA_2PCs_plot <- ggplot(PCcurves_xy) +
  aes(x = knot, y = y, group = fractionOfStDev, color = fractionOfStDev) +
  geom_line() +
  scale_color_gradient2(low = "blue", mid = "grey", high = "orangered",
                        breaks = c(-1, 0 , 1)) +
  facet_grid(dim ~ PC,
             scales = "free_y",
             labeller = labeller(PC = ~ str_glue("PC{.x}"))) + #,
                                 # dim = dimlabels)) +
  labs(color = expression(frac(s[k], sigma[k]))) +
  geom_line(data = PCcurves_xy %>% filter(fractionOfStDev == 0), color = 'black', linewidth = 1.2) 

FPCA_2PCs_plot

On the left, the results of PC1 are pictured, PC2 is pictured on the right. In the top row, the y-value (tongue movement from the bottom to the top of the mouth) is pictured along the y-axis. In the bottom row, the x-values are pictured along the y-axis (tongue movement from back/posterior [small x-value] to front/anterior [large x-value]). The x-axis shows the 11 knots of the AAA tongue splines (tongue root/most posterior knot = 0; tongue tip/most anterior knot = 10).

We can observe that PC1 mainly describes movement along the y-axis, both in the back region of the graph/tongue (around knot 2 to 5) and the front of the tongue (around knot 7 to 10). A higher PC1 score corresponds to a lower position of the back part of the tongue (= lower y-value), while a lower PC1 score corresponds to a higher position (= higher y-value). It also has an influence on the tongue tip (knot 7 to 10): the lower the PC1 score, the lower the tongue tip – the higher PC1, the higher the tongue tip. Along the x-axis, PC1 influences the graph/the tongue shape to be slightly more advanced or retracted, affecting nearly the entire region from knot 2 to knot 8. The higher the PC1 score, the lower the value of the knots along the x-axis, resulting in a more fronted graph/tongue shape compared to the mean. This means that the tongue is more posterior (further back) if the PC1 score is higher; and more anterior (more fronted) if the PC1 score is lower. Interestingly, the opposite is true for the back and front part of the tongue (knot 0 to 2 and knot 9 to 10) - here, the tongue is more posterior (further back) if the PC1 score is lower and more anterior (more fronted) if the PC2 score is higher. However, these movements are relatively small compared to the movements along the y-axis.

Regarding the rhotics that we are investigating, the PC1 score reveals differences in tongue height around both the uvular region (knots 3 to 7) and the alveolar region (knots 8 to 10). A higher PC1 score may tell us that the tongue is further away from the palate in the uvular region, a lower PC1 score may be an indicator that the tongue is approaching the palate. For the alveolar region, a low PC1 score describes a bunched rhotic with the tongue tip pointing downward while a high PC1 score describes the tongue tip as pointing upward.

We would suspect that uvular fricatives have a low PC1 score because the tongue should be close to the uvular region of the palate while articulating uvular fricatives. Uvular approximants should have a low PC1 score as well, but not as low as the fricatives. Alveolar rhotics should have a high PC1 score, because they are not articulated with an approach of the tongue to the uvular. Instead, they have a raised tongue tip which corresponds to a high PC1 score.

PC2 mainly describes movement in knots 5 to 10, that is, the front part of the tongue. A low PC2 score causes a higher y-value and a slightly higher x-value in the front part of the mouth, a high PC2 score causes a lower y-value and a slightly lower x-value. This means the tongue tip is raised and slightly more anterior (fronted) for a low PC2 and the tongue tip is pointed downward and slightly more retracted for a higher PC2.

When considering the rhotics, we would expect uvular rhotics to have higher PC2 scores, as the tongue tip goes downward and the tongue tip might be retracted for uvular rhotics. For alveolar rhotics we expect low PC2 scores, because the tongue tip is typically raised and the tongue might be fronted.

We can look at all of this in a trajectory plot as well, this will show us both dimensions in one plot:

Code
FPCA_knot_traj_plot <- ggplot(PCcurves_xy %>% pivot_wider(names_from = dim, values_from = y) %>% filter(knot %in% c(0,1,2,3,4,5,6,7,8,9,10))) +
  aes(x = x_mean_z, y = y_mean_z, group = fractionOfStDev, color = fractionOfStDev) +
  geom_path() + geom_point(colour = "black") +
  scale_color_gradient2(low = "blue", mid = "grey", high = "orangered",
                        breaks = c(-1, 0 , 1)) +
  facet_grid(. ~ PC,
             # scales = "free_y",
             labeller = labeller(PC = ~ str_glue("PC{.x}"))) +
  labs(color = expression(frac(s[k], sigma[k]))) +
  theme(legend.position = "right")
Code
FPCA_knot_traj_plot
Figure 20: FPCA: Trajectory Plot; PC1 and PC2

Differences between the speakers

Our first goal is to see if there is a difference between the speakers. Since the use of uvular and alveolar rhotics is typically speaker-specific, we would expect clear differences in PC1 and PC2 scores across speakers. Speakers using (mainly) alveolar rhotics should have a higher PC1 and a lower PC2 score overall, assuming the theory described above is accurate. Speakers using (mainly) uvular rhotics should have a lower PC1 and a higher PC2 score in comparison.

Code
nPC = 5
# collect PC scores
PCscores_xy <- mfpca_cartesian_static$scores %>%
  `colnames<-`(paste0("s", 1:nPC)) %>%
  as_tibble() %>%
  bind_cols(curves_xy %>% distinct(filename, frame_id, participant), .)
# note that the PC scores are the same for both dimensions

# scatterplot PC scores s1 and s2 by participant
pc_mean <- PCscores_xy %>% 
  group_by(participant) %>% 
  summarise(
    mean_s1 = mean(s1),
    mean_s2 = mean(s2),
    mean_s3 = mean(s3)
  ) %>% 
  ungroup()

PCscores_per_participants_plot <- ggplot(PCscores_xy) +
  aes(x = s1, y = s2, color = participant) +
  geom_point(alpha = 0.1) +
  geom_point(data = pc_mean, mapping = aes(x = mean_s1, y = mean_s2, color = participant),pch = 17, size = 4) +
  stat_ellipse() +
  theme(legend.position = "right")
Code
PCscores_per_participants_plot
Figure 21: FPCA: distribution of PC scores by participant

In general, we can observe that the participants differ a lot in the PC1 score (s1), but seem to not differ much at all with regard to the PC2 score (s2). So the differences in s2 do not seem to be speaker-specific

We also plot the PC scores as boxplots. In this case we will have a short look at all 5 PC scores that we calculated, to see if there might be another PC score that correlates even stronger to the variable participant than the PC1 score:

Code
FPCA_cartesian_static_scores_boxplot <- PCscores_xy %>%
  pivot_longer(cols = paste0("s", 1:nPC), 
               names_to = "PC", values_to = "score") %>% 
  ggplot() +
  aes(x = participant, y = score, color = participant) +
  geom_boxplot() +
  facet_wrap(~ PC) +
  theme(legend.position = "bottom", axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
Code
FPCA_cartesian_static_scores_boxplot
Figure 22: FPCA: distribution of all 5 PC scores by participant

It looks like s1 has the strongest correlation to the variable participant, while s2 looks like it doesn’t correlate at all. The other PC scores might be weakly correlated to the variable participant. We will leave it at that and concentrate on the PC1 score.

Participant 9 shows the lowest s1, while participant 8 shows the highest s1. As described above, a high s1 describes a lowering of the tongue in the uvular region and a raising of the tongue in the alveolar region what could be interpreted as the pronunciation of an alveolar rhotic (e.g., an alveolar trill). A low s1 describes the opposite – a high tongue dorsum and a low tongue tip, what would be the case for uvular rhotics (e.g., uvular fricatives).

The other participants have a s1 that falls between those of participant 9 and 8.

Linear Model

The question arises, if the correlation between s1 and the variable participant is significant. To test that, we calculate a linear model using the configuration lm(s2 ~ participant).

Code
s <- 1 # score index
model_eq <- str_glue("s{s} ~ participant") %>% as.formula()
mod <- lm(model_eq, data = PCscores_xy)
mod %>% summary()

Call:
lm(formula = model_eq, data = PCscores_xy)

Residuals:
    Min      1Q  Median      3Q     Max 
-0.7904 -0.1623 -0.0098  0.1526  1.2988 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)      -0.12442    0.01853  -6.713 2.43e-11 ***
participantvp005 -0.28477    0.02265 -12.572  < 2e-16 ***
participantvp006  0.09213    0.02261   4.074 4.79e-05 ***
participantvp008  1.10180    0.02285  48.212  < 2e-16 ***
participantvp009 -0.77859    0.02613 -29.801  < 2e-16 ***
participantvp010  0.53049    0.02261  23.458  < 2e-16 ***
participantvp020 -0.17428    0.02257  -7.722 1.73e-14 ***
participantvp023  0.04878    0.02259   2.159   0.0309 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2292 on 2162 degrees of freedom
Multiple R-squared:  0.8271,    Adjusted R-squared:  0.8265 
F-statistic:  1477 on 7 and 2162 DF,  p-value: < 2.2e-16

Pairwise comparisons of participants using emmeans:

Code
emmeans(mod, pairwise ~ participant)
$emmeans
 participant  emmean     SE   df lower.CL upper.CL
 vp004       -0.1244 0.0185 2162  -0.1608 -0.08807
 vp005       -0.4092 0.0130 2162  -0.4347 -0.38365
 vp006       -0.0323 0.0130 2162  -0.0577 -0.00687
 vp008        0.9774 0.0134 2162   0.9512  1.00360
 vp009       -0.9030 0.0184 2162  -0.9391 -0.86689
 vp010        0.4061 0.0130 2162   0.3807  0.43149
 vp020       -0.2987 0.0129 2162  -0.3239 -0.27344
 vp023       -0.0756 0.0129 2162  -0.1010 -0.05031

Confidence level used: 0.95 

$contrasts
 contrast      estimate     SE   df t.ratio p.value
 vp004 - vp005   0.2848 0.0227 2162  12.572  <.0001
 vp004 - vp006  -0.0921 0.0226 2162  -4.074  0.0012
 vp004 - vp008  -1.1018 0.0229 2162 -48.212  <.0001
 vp004 - vp009   0.7786 0.0261 2162  29.801  <.0001
 vp004 - vp010  -0.5305 0.0226 2162 -23.458  <.0001
 vp004 - vp020   0.1743 0.0226 2162   7.722  <.0001
 vp004 - vp023  -0.0488 0.0226 2162  -2.159  0.3772
 vp005 - vp006  -0.3769 0.0184 2162 -20.517  <.0001
 vp005 - vp008  -1.3866 0.0187 2162 -74.296  <.0001
 vp005 - vp009   0.4938 0.0226 2162  21.897  <.0001
 vp005 - vp010  -0.8153 0.0184 2162 -44.381  <.0001
 vp005 - vp020  -0.1105 0.0183 2162  -6.034  <.0001
 vp005 - vp023  -0.3335 0.0183 2162 -18.186  <.0001
 vp006 - vp008  -1.0097 0.0186 2162 -54.228  <.0001
 vp006 - vp009   0.8707 0.0225 2162  38.671  <.0001
 vp006 - vp010  -0.4384 0.0183 2162 -23.921  <.0001
 vp006 - vp020   0.2664 0.0183 2162  14.584  <.0001
 vp006 - vp023   0.0434 0.0183 2162   2.369  0.2570
 vp008 - vp009   1.8804 0.0228 2162  82.633  <.0001
 vp008 - vp010   0.5713 0.0186 2162  30.684  <.0001
 vp008 - vp020   1.2761 0.0186 2162  68.746  <.0001
 vp008 - vp023   1.0530 0.0186 2162  56.643  <.0001
 vp009 - vp010  -1.3091 0.0225 2162 -58.140  <.0001
 vp009 - vp020  -0.6043 0.0225 2162 -26.895  <.0001
 vp009 - vp023  -0.8274 0.0225 2162 -36.784  <.0001
 vp010 - vp020   0.7048 0.0183 2162  38.581  <.0001
 vp010 - vp023   0.4817 0.0183 2162  26.329  <.0001
 vp020 - vp023  -0.2231 0.0182 2162 -12.230  <.0001

P value adjustment: tukey method for comparing a family of 8 estimates 

The emmeans function reveals that there are significant differences in the PC1 scores of all the contrast except for the contrast between participant 4 and participant 23 (p = 0.3132) and participant 6 and 23 (p = 0.3627). [Note: This is actually interesting because participant 23 uses an alveolar rhotic while 4 and 6 use an uvular rhotic, so we would expect 4 and 6 to be different from 23. Maybe this variation is covered by one of the other PC scores.]

Code
predCurves <- emmeans(mod, pairwise ~ participant)$emmeans %>%
  as_tibble() %>%
  expand_grid(dim = 1:2) %>% 
  group_by(participant, dim) %>% 
  reframe(bind_cols(
    funData2long1(mfpca_cartesian_static$meanFunction[[dim]] +
                    emmean * mfpca_cartesian_static$functions[[dim]][s], time = "knot", value = "y"),
    funData2long1(mfpca_cartesian_static$meanFunction[[dim]] +
                    lower.CL * mfpca_cartesian_static$functions[[dim]][s], time = "knot", value = "y_l") %>% 
      select(y_l),
    funData2long1(mfpca_cartesian_static$meanFunction[[dim]] +
                    upper.CL * mfpca_cartesian_static$functions[[dim]][s], time = "knot", value = "y_u") %>% 
      select(y_u)
  )) %>% 
  mutate(dim = factor(dim, levels = c(1,2), labels = c('x_mean_z', 'y_mean_z'))) %>% 
  pivot_wider(names_from = dim, values_from = starts_with("y"))


FPCA_knot_pred_plot <- ggplot(predCurves) +
  aes(group = participant, color = participant) +
  geom_path(aes(x = y_x_mean_z, y = y_y_mean_z)) +
  geom_path(aes(x = y_l_x_mean_z, y = y_l_y_mean_z), linetype = "dotted") +
  geom_path(aes(x = y_u_x_mean_z, y = y_u_y_mean_z), linetype = "dotted") +
  xlab("x_mean_z") + ylab("y_mean_z") +
  theme(legend.position = "bottom")

The predicted plots from the linear model for the participants look like this:

Code
FPCA_knot_pred_plot
Figure 23: FPCA: Predicted tongue contours by participant based on PC1 score

In comparison with the GAM predictions (e.g., Figure 10), this looks very similar in general although this is only the prediction based on the first PC score. But there are also differences. Especially the tongue shape of participant 23 looks different than in the GAM prediction plot. In the FPCA prediction, the tongue shape of participant 23 is very close to the tongue shapes of participants 4 and 6, while in the GAM prediction, the tongue tip points more upward and the general shape looks more like that of participant 10.

Keep in mind that this plot only includes the variation that is described by PC1 and does not include the variation covered by other PCs. The variation of participant 23’s exact tongue shape may be covered by one of the other PCs.

2 dynamic data

After the static analyses, we will now try to expand the models so they also include the time axis. The time variable is an additional independent variable in our data set.

Read in the (Cartesian) data set:

Code
rhotic_data <- readRDS("../data/RDS/rhotic_data.rds")

This data set is similar to the one we used for the static data – with the addition that it has a normtime variable and has multiple tongue contours for one rhotic (one for each frame within the rhotic). The rhotics consist of a varying number of time frames since the rhotics have varying lengths (shortest /r/ = 1 frame (extended to 2 frames in the data_preparation script); longest /r/ = 15 frames), so the “clips” of the tongue moving contain different numbers of frames (different than in Gubian (2025) where all clips have exactly 6 frames). In Section 2.2.1 we will change that and interpolate the data.

When picking one rhotic as an example, we can observe how the tongue moves over time. Here’s the example of the /r/ in the word “Rotation” being articulated by all of the participants:

Code
dynamic_data_example <- rhotic_data |> filter(target == "Rotation") %>% 
  ggplot(aes(x_z, y_z, group = frame_id_file, colour = normtime)) +
  coord_cartesian() +
  geom_point() +
  geom_path(alpha = 0.5) +
  facet_wrap(vars(participant), ncol = 4)
Code
dynamic_data_example
Figure 24: Example of dynamic data; target word “Rotation” for all participants

In this plot, the dark blue color marks the beginning of the rhotic, the light blue color the end of the rhotic. It can be observed that most of the participants lower their tongue (or at least their tongue tip) during the articulation of the rhotic – the only exception being participant 23, who raises their tongue and only lowers it in the very last frame.

2.1 GAMM

2.1.1 Polar coordinates

Read in the data set containing the dynamic Polar coordinates:

Code
rhotic_data_polar <- readRDS("../data/RDS/rhotic_data_polar_dyn.rds")

We will again perform a Procrustean angle normalization as in Section 1.1.1.2:

Code
angle_range <- rhotic_data_polar  %>% 
  filter(knot %in% range(knot)) %>% # first and last knot
  group_by(knot) %>% 
  summarise(median_angle = median(angle)) %>% 
  pull(median_angle) %>% 
  sort()

angle_grid <- seq(angle_range[1], angle_range[2], length.out=11)

proc_polar_curves <- rhotic_data_polar  %>% 
  group_by(participant, clip_id, normtime, target) %>% 
  # interpolation
  reframe(bind_cols(
    knot = rev(seq(along.with = angle_grid, from = 0)), # fake knots, needed for AR1 term in GAM
    approx(angle, radius, angle_grid, rule = 2) %>% as_tibble()
    )) %>% 
  rename(angle = x,radius = y)

Now we can plot one example of the Polar data. As above, we choose the example target “Rotation”

Code
dataset_ex_flat <- proc_polar_curves %>% filter(target == "Rotation") %>%
  ggplot() +
  aes(x = angle, y= normtime) +
  geom_raster(aes(fill = radius)) +
  facet_grid(~ participant) +
  scale_x_reverse() +
  scale_fill_gradientn(colours = terrain.colors(10))
Code
dataset_ex_flat
Figure 25: Example of polar dynamic data displayed flat

What do we see?

  • y-axis: normtime – at the bottom is the start of the rhotic, at the top is the end of the rhotic
  • x-axis: angle – on the left of each plot is the biggest angle = the left side of the tongue = tongue root. At the right of each plot is the smallest angle = right side of the tongue = tongue tip.
  • color: radius – green = small radius = tongue is close to the origin; yellow-ish/brown-ish = medium radius = tongue is further away from the origin; white = large radius = tongue is far away from the origin

an example: We’re looking at the tongue tip of participant 6, so the rightmost part of their plot that has the smallest angle. At the bottom (at 0 (%) of the normtime), the plot starts with a very light-brown color. That’s the very start of the rhotic. This means that the tongue tip has to be far away from the origin, so it’s probably raised in the beginning. If when following the tongue tip to the top, so up to the end (time-wise) of the rhotic, we observe that the color becomes a darker brown over time and even gets yellow-ish at the very top (at 100 %). This means that the radius gets smaller: the tongue tip position changes and moves closer to the origin. So, the tongue tip of participant 6 probably starts with being raised and then moves downward over the course of the rhotic. The opposite is true for the tongue tip of participant 23: Here, the color on the bottom of the very right of the plot is yellow-ish and gets white at a normtime of ~ 80 %, what means that the tongue tip moves upward.

We will also picture the same plot as a normal Polar plot:

Code
dataset_ex_polar <- proc_polar_curves %>% filter(target == "Rotation") %>%
  ggplot() +
  aes(angle, radius, group = normtime, color = normtime) +
  geom_path() +
  facet_wrap(vars(participant), ncol = 3) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
dataset_ex_polar
Figure 26: Example of polar dynamic data displayed in a polar coordinate system

We’re doing the same as in Section 1.1.1 but use a tensor product smooth interaction, containing angle and normtime instead of the angle-smooth (Gubian 2025; Wieling 2018).

Code
GAM_dyn_polar <- bam(radius ~ participant + te(angle, normtime, by = factor(participant)),
            discrete = TRUE,
            family = 'scat',
            data = proc_polar_curves
)
AR1.rho <- acf_resid(GAM_dyn_polar)[2]

Taking autocorrelation into account:

Code
GAM_dyn_polar <- bam(radius ~ participant + te(angle, normtime, by = factor(participant)),
                          discrete = TRUE,
                        family = 'scat',
                        rho = AR1.rho,
                        AR.start = proc_polar_curves %>% pull(knot) == 0,
                        data = proc_polar_curves
)
summary(GAM_dyn_polar)

Family: Scaled t(3.573,0.142) 
Link function: identity 

Formula:
radius ~ participant + te(angle, normtime, by = factor(participant))

Parametric coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       3.038428   0.003744 811.580  < 2e-16 ***
participantvp005  0.006014   0.005173   1.163  0.24498    
participantvp006 -0.062410   0.004391 -14.214  < 2e-16 ***
participantvp008 -0.065073   0.004909 -13.256  < 2e-16 ***
participantvp009  0.047635   0.005098   9.344  < 2e-16 ***
participantvp010 -0.028021   0.004886  -5.735 9.79e-09 ***
participantvp020 -0.011477   0.004337  -2.646  0.00814 ** 
participantvp023  0.046037   0.004605   9.998  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                               edf Ref.df     F p-value    
te(angle,normtime):factor(participant)vp004  9.905 10.683 14247  <2e-16 ***
te(angle,normtime):factor(participant)vp005 22.723 23.762 15415  <2e-16 ***
te(angle,normtime):factor(participant)vp006 19.927 22.135 30959  <2e-16 ***
te(angle,normtime):factor(participant)vp008 17.604 19.937 27958  <2e-16 ***
te(angle,normtime):factor(participant)vp009  9.000  9.006 13203  <2e-16 ***
te(angle,normtime):factor(participant)vp010 23.395 23.945 20750  <2e-16 ***
te(angle,normtime):factor(participant)vp020 18.498 20.926 33862  <2e-16 ***
te(angle,normtime):factor(participant)vp023 23.845 23.996 19687  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.952   Deviance explained = 81.8%
fREML = 1.7002e+05  Scale est. = 1         n = 118954
Code
gam.check(GAM_dyn_polar)


Method: fREML   Optimizer: perf chol
$grad
 [1]  1.475264e-12 -1.917022e-11  9.743317e-13  4.202860e-12  8.449685e-12
 [6] -9.017054e-11 -1.637712e-11 -1.861293e-10  4.044911e-10 -5.811770e-04
[11] -6.119105e-12  2.216645e-10 -2.832401e-12 -3.832135e-11  9.774404e-13
[16]  5.894840e-11

$hess
               [,1]          [,2]          [,3]          [,4]          [,5]
 [1,]  2.997074e+00 -1.199438e-04 -4.614681e-09  3.421072e-07  8.914322e-12
 [2,] -1.199438e-04  8.826878e-02 -5.494092e-07  4.072803e-05  1.061298e-09
 [3,] -4.614681e-09 -5.494092e-07  3.018012e+00  1.352097e-01  1.763464e-08
 [4,]  3.421072e-07  4.072803e-05  1.352097e-01  4.753801e+00 -6.668579e-07
 [5,]  8.914322e-12  1.061298e-09  1.763464e-08 -6.668579e-07  2.999628e+00
 [6,] -1.236452e-09 -1.472063e-07 -2.446008e-06  9.249448e-05  1.657878e-02
 [7,] -1.400239e-14 -1.667060e-12 -2.770007e-11  1.047483e-09 -1.776020e-09
 [8,] -1.160682e-12 -1.381854e-10 -2.296105e-09  8.682759e-08 -1.472230e-07
 [9,]  2.340396e-17  2.786031e-15  4.629271e-14 -1.750567e-12  2.968053e-12
[10,]  7.393219e-19  8.801841e-17  1.462521e-15 -5.530547e-14  9.376931e-14
[11,] -2.477520e-20 -2.157959e-18 -3.572381e-17  1.350902e-15 -2.290429e-15
[12,] -4.770949e-19 -7.349701e-17 -1.230472e-15  4.653049e-14 -7.889159e-14
[13,] -2.148014e-21 -1.191201e-20  1.807830e-20 -6.836011e-19  1.159166e-18
[14,] -7.361438e-20 -6.223946e-20 -9.296161e-19  3.515384e-17 -5.960111e-17
[15,] -3.164340e-21 -3.385261e-20 -3.524419e-23  1.400162e-21 -2.224012e-21
[16,]  1.407208e-19  5.890248e-19 -1.343038e-21  4.974567e-20 -8.878105e-20
               [,6]          [,7]          [,8]          [,9]         [,10]
 [1,] -1.236452e-09 -1.400239e-14 -1.160682e-12  2.340946e-17  7.394084e-19
 [2,] -1.472063e-07 -1.667060e-12 -1.381854e-10  2.786041e-15  8.801855e-17
 [3,] -2.446008e-06 -2.770007e-11 -2.296105e-09  4.629271e-14  1.462521e-15
 [4,]  9.249448e-05  1.047483e-09  8.682759e-08 -1.750567e-12 -5.530547e-14
 [5,]  1.657878e-02 -1.776020e-09 -1.472230e-07  2.968053e-12  9.376931e-14
 [6,]  5.774156e+00 -8.428807e-07 -6.987315e-05  1.408582e-09  4.450109e-11
 [7,] -8.428807e-07  2.991786e+00  5.738070e-02  3.817469e-08  1.205696e-09
 [8,] -6.987315e-05  5.738070e-02  1.151110e+00  9.245894e-07  2.916644e-08
 [9,]  1.408582e-09  3.817469e-08  9.245894e-07  2.998276e+00  3.524668e-07
[10,]  4.450109e-11  1.205696e-09  2.916644e-08  3.524668e-07  5.815124e-04
[11,] -1.086994e-12 -2.945922e-11 -7.135052e-10 -3.558056e-08 -8.009306e-10
[12,] -3.744045e-11 -1.014694e-09 -2.457599e-08 -1.225539e-06 -2.759081e-08
[13,]  5.501184e-16  1.490906e-14  3.610989e-13  1.800701e-11  4.053108e-13
[14,] -2.828555e-14 -7.665822e-13 -1.856670e-11 -9.258702e-10 -2.084011e-11
[15,] -1.055476e-18 -2.860503e-17 -6.928168e-16 -3.454887e-14 -7.776434e-16
[16,] -4.213375e-17 -1.141890e-15 -2.765669e-14 -1.379163e-12 -3.104291e-14
              [,11]         [,12]         [,13]         [,14]         [,15]
 [1,] -1.540183e-20 -7.038136e-19 -2.055317e-22  7.165330e-20 -1.200719e-20
 [2,] -2.140769e-18 -7.433558e-17  3.462863e-22  1.932725e-19 -4.149223e-20
 [3,] -3.572380e-17 -1.230472e-15  1.807946e-20 -9.295799e-19 -3.772144e-23
 [4,]  1.350902e-15  4.653049e-14 -6.836757e-19  3.515156e-17  1.536621e-21
 [5,] -2.290429e-15 -7.889159e-14  1.159166e-18 -5.960111e-17 -2.224011e-21
 [6,] -1.086994e-12 -3.744045e-11  5.501184e-16 -2.828555e-14 -1.055477e-18
 [7,] -2.945922e-11 -1.014694e-09  1.490906e-14 -7.665822e-13 -2.860503e-17
 [8,] -7.135052e-10 -2.457599e-08  3.610989e-13 -1.856670e-11 -6.928168e-16
 [9,] -3.558056e-08 -1.225539e-06  1.800701e-11 -9.258702e-10 -3.454887e-14
[10,] -8.009306e-10 -2.759081e-08  4.053108e-13 -2.084011e-11 -7.776434e-16
[11,]  3.114096e+00  2.806731e-01  2.500328e-08 -1.285625e-06 -4.797218e-11
[12,]  2.806731e-01  5.570726e+00  1.437460e-07 -7.392049e-06 -2.757968e-10
[13,]  2.500328e-08  1.437460e-07  2.999310e+00  1.099923e-02 -2.330439e-09
[14,] -1.285625e-06 -7.392049e-06  1.099923e-02  4.288733e+00 -7.879517e-07
[15,] -4.797218e-11 -2.757968e-10 -2.330439e-09 -7.879517e-07  3.376904e+00
[16,] -1.915011e-09 -1.100959e-08 -9.302924e-08 -3.145437e-05  4.359635e-01
              [,16]
 [1,] -3.055066e-21
 [2,] -1.063754e-20
 [3,] -1.385488e-21
 [4,]  5.242049e-20
 [5,] -8.878097e-20
 [6,] -4.213377e-17
 [7,] -1.141890e-15
 [8,] -2.765669e-14
 [9,] -1.379163e-12
[10,] -3.104291e-14
[11,] -1.915011e-09
[12,] -1.100959e-08
[13,] -9.302924e-08
[14,] -3.145437e-05
[15,]  4.359635e-01
[16,]  6.036145e+00

Model rank =  200 / 200 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

                                              k'  edf k-index p-value    
te(angle,normtime):factor(participant)vp004 24.0  9.9    0.94  <2e-16 ***
te(angle,normtime):factor(participant)vp005 24.0 22.7    0.94  <2e-16 ***
te(angle,normtime):factor(participant)vp006 24.0 19.9    0.94  <2e-16 ***
te(angle,normtime):factor(participant)vp008 24.0 17.6    0.94  <2e-16 ***
te(angle,normtime):factor(participant)vp009 24.0  9.0    0.94  <2e-16 ***
te(angle,normtime):factor(participant)vp010 24.0 23.4    0.94  <2e-16 ***
te(angle,normtime):factor(participant)vp020 24.0 18.5    0.94  <2e-16 ***
te(angle,normtime):factor(participant)vp023 24.0 23.8    0.94  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
acf_resid(GAM_dyn_polar)

Get the predictions and plot:

Code
normtime_grid <- (0:5)/5
pred <- get_predictions(GAM_dyn_polar,
                        cond = list(angle = angle_grid_plot, 
                                    normtime = normtime_grid,
                                    participant = factor(rhotic_data_static_polar$participant) %>% levels()),
                        print.summary = FALSE)

gam_dyn_polar_flat_plot <- ggplot(pred) +
  aes(x = angle, y = normtime, group = participant) +
  geom_raster(aes(fill = fit)) +
  facet_grid(~ participant) +
  scale_x_reverse() +
  scale_fill_gradientn(colours = terrain.colors(10))

gam_polar_dyn_polar_plot <- ggplot(pred) +
  aes(x = angle, y = fit, group = normtime, color = normtime) +
  geom_path() +
  facet_wrap(vars(participant), ncol = 3) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
gam_dyn_polar_flat_plot
Figure 27: GAM prediction of polar dynamic data (flat version)
Code
gam_polar_dyn_polar_plot
Figure 28: GAM prediction of polar dynamic data (polar version)

It gets evident from the plots that the tongue movements are rather small but highly individual for the participants:

  • participant 4 slightly moves their whole tongue downward
  • participant 5, 6 and 9 move their tongue slightly backward
  • participant 8 moves their tongue tip upward
  • participant 10 and 23 moves their tongue tip first upward and then downward (like a tap)
  • participant 20 moves their tongue downward and backward

This can give us a general hint about the articulation of the participants, but it would probably be more meaningful if we took the phonetic surrounding (e.g., following vowel) into account [what I will do in my dissertation].

We look at the different tongue shapes of the participants at one time point to compare them, e.g., at 0.5/50 % of the normalized time:

Code
sec_time <- 0.5
pred <- get_predictions(GAM_dyn_polar,
                        cond = list(angle = angle_grid_plot,
                                    normtime = sec_time,
                                    participant = factor(rhotic_data_static_polar$participant) %>% levels()),
                        print.summary = FALSE)

sec_time_plot <- ggplot(pred) +
  aes(x = angle, y = fit, group = participant ) +
  geom_line(mapping = aes(color = participant)) +
  geom_ribbon(mapping = aes(ymin = fit - CI, ymax = fit + CI, fill = participant), alpha = 0.5) +
  theme_light() +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
sec_time_plot
Figure 29: Cross-section of GAM prediction of polar dynamic data at normalized time point 0.5

Also, we look at one specific point on the shapes (= one angle value) and see how this point changes over time. Here we will take the “middle” \(\pi\)/2

Code
sec_angle <- pi/2
pred <- get_predictions(GAM_dyn_polar,
                        cond = list(angle = sec_angle,
                                    normtime = normtime_grid,
                                    participant = factor(rhotic_data_static_polar$participant) %>% levels()),
                        print.summary = FALSE)

sec_angle_plot <- ggplot(pred) +
  aes(x = normtime, y = fit, group = participant ) +
  geom_line(mapping = aes(color = participant)) +
  geom_ribbon(mapping = aes(ymin = fit - CI, ymax = fit + CI, fill = participant), alpha = 0.5) +
  theme_light() +
  ylab("Radius") +
  theme(legend.position = "bottom")
Code
sec_angle_plot
Figure 30: Cross-section of GAM prediction of polar dynamic data at angle value \(\pi/2\)

Or we can look at the angle where the tongue tips approximately are (We’re taking the value \(\pi * 7/16\), because that’s very close to the tongue tip):

Code
sec_angle <- pi*7/16
pred <- get_predictions(GAM_dyn_polar,
                        cond = list(angle = sec_angle,
                                    normtime = normtime_grid,
                                    participant = factor(rhotic_data_static_polar$participant) %>% levels()),
                        print.summary = FALSE)

sec_angle_plot_tt <- ggplot(pred) +
  aes(x = normtime, y = fit, group = participant ) +
  geom_line(mapping = aes(color = participant)) +
  geom_ribbon(mapping = aes(ymin = fit - CI, ymax = fit + CI, fill = participant), alpha = 0.5) +
  theme_light() +
  ylab("Radius") +
  theme(legend.position = "bottom")
Code
sec_angle_plot_tt
Figure 31: Cross-section of GAM prediction of polar dynamic data approximately at tongue tip

It’s very interesting to see how the tongue tip of participant 23 moves upward and then takes a turn at a normtime of around 80 %. It could as well be the case that other participants also do taps and move their tongue up and then downward but that this cancels out due to them not doing it consistently at the same normtime.

2.1.2 Cartesian coordinates

2.1.2.1 Univariate GAMM

We will do the analysis similar as with the static data, but additionally, we will take into account the normtime variable. We’ve got a dimensional interaction between knot and normtime – this means that we need to introduce a te() smooth as above with the Polar data:

Code
# create compound factor participant.dim for all combinations of participant and dimension
rhotic_data_long <- rhotic_data %>% 
             pivot_longer(c(x_z, y_z), names_to = 'dimension', values_to = 'position') %>% 
             unite("participant.dim", c(participant, dimension), sep = ".") %>% 
             mutate(participant.dim = factor(participant.dim)) %>% 
             # order to capture AR along knots
             arrange(sort_id, participant.dim, knot)

# calculate GAM
GAM_dyn <- bam(position ~ participant.dim + 
             te(knot, normtime, by = participant.dim),
           data = rhotic_data_long,
           family = 'scat',
           discrete = TRUE)

# autocorellation:
AR1.rho_dyn <- acf_resid(GAM_dyn)[2]

Code
#Model that considers autocorrelation in the residuals
GAM_dyn <- bam(position ~ participant.dim + 
                 te(knot, normtime, by = participant.dim),
           data = rhotic_data_long,
           rho = AR1.rho_dyn, # error model for residuals
            AR.start = rhotic_data_long %>% pull(knot) == 0,
           family = 'scat',
           discrete = TRUE)
summary(GAM_dyn)

Family: Scaled t(5.763,0.122) 
Link function: identity 

Formula:
position ~ participant.dim + te(knot, normtime, by = participant.dim)

Parametric coefficients:
                           Estimate Std. Error t value Pr(>|t|)    
(Intercept)              -0.0411883  0.0017262 -23.861  < 2e-16 ***
participant.dimvp004.y_z  0.0144100  0.0019140   7.529 5.15e-14 ***
participant.dimvp005.x_z -0.0051116  0.0024008  -2.129   0.0332 *  
participant.dimvp005.y_z  0.0245892  0.0024180  10.169  < 2e-16 ***
participant.dimvp006.x_z -0.0148836  0.0020361  -7.310 2.68e-13 ***
participant.dimvp006.y_z  0.0035555  0.0020351   1.747   0.0806 .  
participant.dimvp008.x_z  0.0005764  0.0023874   0.241   0.8092    
participant.dimvp008.y_z  0.0102097  0.0023940   4.265 2.00e-05 ***
participant.dimvp009.x_z  0.0020936  0.0023983   0.873   0.3827    
participant.dimvp009.y_z  0.0165040  0.0023681   6.969 3.19e-12 ***
participant.dimvp010.x_z -0.0131069  0.0022774  -5.755 8.66e-09 ***
participant.dimvp010.y_z  0.0112327  0.0022959   4.893 9.96e-07 ***
participant.dimvp020.x_z -0.0021308  0.0019793  -1.077   0.2817    
participant.dimvp020.y_z  0.0218812  0.0020092  10.890  < 2e-16 ***
participant.dimvp023.x_z -0.0124989  0.0021317  -5.863 4.54e-09 ***
participant.dimvp023.y_z  0.0357944  0.0021457  16.682  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Approximate significance of smooth terms:
                                              edf Ref.df      F p-value    
te(knot,normtime):participant.dimvp004.x_z  8.999  9.005  38408  <2e-16 ***
te(knot,normtime):participant.dimvp004.y_z 13.138 15.064  22872  <2e-16 ***
te(knot,normtime):participant.dimvp005.x_z 20.653 22.567  26722  <2e-16 ***
te(knot,normtime):participant.dimvp005.y_z 21.187 22.934  27064  <2e-16 ***
te(knot,normtime):participant.dimvp006.x_z 20.061 22.195  44189  <2e-16 ***
te(knot,normtime):participant.dimvp006.y_z 18.524 20.901  49165  <2e-16 ***
te(knot,normtime):participant.dimvp008.x_z 21.973 23.413  25413  <2e-16 ***
te(knot,normtime):participant.dimvp008.y_z 22.037 23.447  25999  <2e-16 ***
te(knot,normtime):participant.dimvp009.x_z 11.547 13.151  30162  <2e-16 ***
te(knot,normtime):participant.dimvp009.y_z  9.046  9.092  44111  <2e-16 ***
te(knot,normtime):participant.dimvp010.x_z 22.112 23.505  28878  <2e-16 ***
te(knot,normtime):participant.dimvp010.y_z 23.293 23.912  29358  <2e-16 ***
te(knot,normtime):participant.dimvp020.x_z  9.034  9.068 118813  <2e-16 ***
te(knot,normtime):participant.dimvp020.y_z 20.010 22.208  49276  <2e-16 ***
te(knot,normtime):participant.dimvp023.x_z 21.848 23.395  35520  <2e-16 ***
te(knot,normtime):participant.dimvp023.y_z 23.916 23.997  35157  <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

R-sq.(adj) =  0.977   Deviance explained = 89.9%
fREML = 3.6353e+05  Scale est. = 1         n = 237908

Check the dynamic GAM and the distribution of residuals:

Code
gam.check(GAM_dyn)


Method: fREML   Optimizer: perf chol
$grad
 [1]  1.929412e-10 -1.182874e-03  2.430056e-12 -4.187761e-13 -4.130030e-14
 [6] -1.572076e-13  1.554312e-14 -3.902656e-12  1.332268e-13 -7.114309e-13
[11]  2.002842e-13 -1.150369e-11 -4.529710e-14  8.881784e-16 -1.598721e-14
[16]  3.721468e-13 -5.648815e-13 -1.452947e-10 -6.583862e-10 -9.458721e-04
[21]  1.332268e-15 -7.549517e-14  3.654854e-13  3.028688e-13 -9.217471e-11
[26] -1.136610e-03  2.478018e-13  6.394885e-14  5.151435e-14 -2.113865e-13
[31]  2.775558e-13  7.368328e-12

$hess
               [,1]          [,2]          [,3]          [,4]          [,5]
 [1,]  2.996532e+00  9.547669e-08  7.208955e-06  3.623896e-06 -1.112299e-17
 [2,]  9.547669e-08  1.182302e-03  7.948632e-08  6.904922e-05 -2.074604e-19
 [3,]  7.208955e-06  7.948632e-08  2.999682e+00  2.258219e-04  3.123062e-18
 [4,]  3.623896e-06  6.904922e-05  2.258219e-04  1.597205e+00  9.101548e-19
 [5,] -1.112429e-17 -2.079132e-19  3.125492e-18  9.112120e-19  2.998777e+00
 [6,] -1.732579e-15 -3.231699e-17  4.860202e-16  1.417474e-16  1.965590e-02
 [7,] -4.913732e-19 -9.214581e-21  1.388956e-19  4.067645e-20 -1.632856e-04
 [8,] -5.936277e-16 -1.107505e-17  1.664931e-16  4.855633e-17 -1.511694e-03
 [9,] -1.007718e-17 -1.878453e-19  2.829555e-18  8.242206e-19 -5.379722e-30
[10,] -6.341355e-16 -1.182091e-17  1.780635e-16  5.184711e-17 -3.384978e-28
[11,] -2.972736e-18 -5.548534e-20  8.356916e-19  2.432241e-19 -1.588349e-30
[12,] -3.305295e-17 -6.146738e-19  9.272012e-18  2.694915e-18 -1.761472e-29
[13,] -9.906617e-18 -1.849437e-19  2.782066e-18  8.101716e-19 -5.289752e-30
[14,]  7.717637e-16  1.440012e-17 -2.167860e-16 -6.312955e-17  4.118651e-28
[15,] -4.114474e-18 -7.682294e-20  1.157202e-18  3.346220e-19 -2.191417e-30
[16,] -9.589617e-16 -1.788819e-17  2.692626e-16  7.845027e-17 -5.118946e-28
[17,]  1.368037e-17  2.552533e-19 -3.842061e-18 -1.118851e-18  7.307790e-30
[18,]  2.947705e-16  5.501281e-18 -8.275325e-17 -2.411770e-17  1.574380e-28
[19,] -2.557142e-18 -4.775840e-20  7.203383e-19  2.094796e-19 -1.367598e-30
[20,] -1.672372e-19 -3.120574e-21  4.737432e-20  1.376347e-20 -8.955159e-32
[21,] -1.714873e-17 -3.197583e-19  4.812825e-18  1.402712e-18 -9.154456e-30
[22,]  5.383252e-16  1.004148e-17 -1.510737e-16 -4.405153e-17  2.874584e-28
[23,] -4.392782e-18 -8.193913e-20  1.230909e-18  3.590111e-19 -2.341955e-30
[24,] -1.303119e-15 -2.430182e-17  3.657223e-16  1.066251e-16 -6.960061e-28
[25,] -4.947258e-18 -9.238020e-20  1.390576e-18  4.051054e-19 -2.645661e-30
[26,] -2.606416e-18 -4.864554e-20  7.327353e-19  2.134849e-19 -1.393377e-30
[27,] -1.222213e-18 -2.283855e-20  3.430953e-19  1.000059e-19 -6.542959e-31
[28,]  1.768788e-16  3.295086e-18 -4.969322e-17 -1.447434e-17  9.449123e-29
[29,] -2.719992e-17 -5.073312e-19  7.635798e-18  2.225318e-18 -1.452751e-29
[30,]  1.353152e-15  2.523731e-17 -3.798603e-16 -1.107203e-16  7.227871e-28
[31,] -3.903809e-19 -7.310523e-21  1.099010e-19  3.195656e-20 -2.091840e-31
[32,] -2.871362e-16 -5.355267e-18  8.059512e-17  2.349134e-17 -1.533552e-28
               [,6]          [,7]          [,8]          [,9]         [,10]
 [1,] -1.728179e-15 -4.936711e-19 -5.915077e-16 -1.006493e-17 -6.324185e-16
 [2,] -3.223313e-17 -9.207705e-21 -1.103250e-17 -1.877260e-19 -1.179555e-17
 [3,]  4.852304e-16  1.386108e-19  1.660809e-16  2.825985e-18  1.775676e-16
 [4,]  1.414108e-16  4.039537e-20  4.840099e-17  8.235775e-19  5.174858e-17
 [5,]  1.965590e-02 -1.632856e-04 -1.511694e-03 -5.383293e-30 -3.382329e-28
 [6,]  2.510023e+00  6.672236e-04 -5.392335e-01 -8.371581e-28 -5.261225e-26
 [7,]  6.672236e-04  2.999948e+00  1.797152e-02 -2.396223e-31 -1.498749e-29
 [8,] -5.392335e-01  1.797152e-02  2.716898e+00 -2.868160e-28 -1.801004e-26
 [9,] -8.360376e-28 -2.388836e-31 -2.859959e-28  2.999459e+00  5.449731e-03
[10,] -5.260443e-26 -1.502701e-29 -1.799655e-26  5.449731e-03  6.247506e+00
[11,] -2.466969e-28 -7.055374e-32 -8.449355e-29 -1.192388e-04  1.011087e-05
[12,] -2.734846e-27 -7.826963e-31 -9.373061e-28 -1.020149e-04 -4.026390e-03
[13,] -8.225767e-28 -2.348529e-31 -2.812001e-28 -4.789622e-30 -3.005362e-28
[14,]  6.405782e-26  1.829114e-29  2.193264e-26  3.727822e-28  2.341749e-26
[15,] -3.405078e-28 -9.724553e-32 -1.165642e-28 -1.987401e-30 -1.245355e-28
[16,] -7.956604e-26 -2.272554e-29 -2.722659e-26 -4.632946e-28 -2.909841e-26
[17,]  1.135691e-27  3.244419e-31  3.887781e-28  6.611594e-30  4.155583e-28
[18,]  2.443948e-26  6.988469e-30  8.375563e-27  1.425131e-28  8.948452e-27
[19,] -2.125276e-28 -6.074769e-32 -7.277123e-29 -1.236821e-30 -7.775915e-29
[20,] -1.398508e-29 -3.993665e-33 -4.794521e-30 -8.095854e-32 -5.119961e-30
[21,] -1.421960e-27 -4.063008e-31 -4.867116e-28 -8.279777e-30 -5.204602e-28
[22,]  4.465498e-26  1.275284e-29  1.528167e-26  2.598746e-28  1.634288e-26
[23,] -3.644743e-28 -1.040396e-31 -1.245025e-28 -2.118491e-30 -1.333357e-28
[24,] -1.080887e-25 -3.087318e-29 -3.698550e-26 -6.292556e-28 -3.955735e-26
[25,] -4.104404e-28 -1.173673e-31 -1.406972e-28 -2.395540e-30 -1.501889e-28
[26,] -2.163796e-28 -6.183187e-32 -7.414169e-29 -1.263018e-30 -7.919340e-29
[27,] -1.011726e-28 -2.896355e-32 -3.470381e-29 -5.904776e-31 -3.702401e-29
[28,]  1.467785e-26  4.193938e-30  5.023071e-27  8.546032e-29  5.371435e-27
[29,] -2.255923e-27 -6.445837e-31 -7.725078e-28 -1.314271e-29 -8.257564e-28
[30,]  1.122723e-25  3.207357e-29  3.842442e-26  6.537451e-28  4.107933e-26
[31,] -3.232733e-29 -9.270018e-33 -1.106859e-29 -1.906402e-31 -1.189412e-29
[32,] -2.382673e-26 -6.804101e-30 -8.153918e-27 -1.387024e-28 -8.718022e-27
              [,11]         [,12]         [,13]         [,14]         [,15]
 [1,] -2.968991e-18 -3.307411e-17 -9.893266e-18  7.721170e-16 -4.099863e-18
 [2,] -5.537612e-20 -6.168817e-19 -1.845242e-19  1.440114e-17 -7.646858e-20
 [3,]  8.336199e-19  9.286400e-18  2.777787e-18 -2.167915e-16  1.151141e-18
 [4,]  2.429421e-19  2.706338e-18  8.095311e-19 -6.317961e-17  3.354773e-19
 [5,] -1.588285e-30 -1.769851e-29 -5.294883e-30  4.129510e-28 -2.193442e-30
 [6,] -2.468606e-28 -2.750493e-27 -8.231310e-28  6.419745e-26 -3.410336e-28
 [7,] -7.034576e-32 -7.889648e-31 -2.370347e-31  1.845534e-29 -9.794948e-32
 [8,] -8.466335e-29 -9.428548e-28 -2.821192e-28  2.202867e-26 -1.168966e-28
 [9,] -1.192388e-04 -1.020149e-04 -4.786372e-30  3.734273e-28 -1.983225e-30
[10,]  1.011087e-05 -4.026390e-03 -3.011982e-28  2.350170e-26 -1.247228e-28
[11,]  2.999662e+00  3.031588e-03 -1.413462e-30  1.102952e-28 -5.863296e-31
[12,]  3.031588e-03  3.713635e+00 -1.571433e-29  1.227336e-27 -6.501543e-30
[13,] -1.412055e-30 -1.571605e-29  3.000136e+00  3.830207e-02 -4.571796e-04
[14,]  1.100280e-28  1.224028e-27  3.830207e-02  4.195797e+00 -6.543283e-04
[15,] -5.866739e-31 -6.518926e-30 -4.571796e-04 -6.543283e-04  3.010674e+00
[16,] -1.366806e-28 -1.520123e-27  5.782589e-04  7.971757e-01  1.072676e-01
[17,]  1.952020e-30  2.171635e-29  6.499314e-30 -5.068490e-28  2.691065e-30
[18,]  4.203244e-29  4.680039e-28  1.399324e-28 -1.092593e-26  5.800234e-29
[19,] -3.652885e-31 -4.062559e-30 -1.216858e-30  9.491165e-29 -5.042549e-31
[20,] -2.406459e-32 -2.649987e-31 -8.019196e-32  6.246514e-30 -3.325389e-32
[21,] -2.442041e-30 -2.722271e-29 -8.140084e-30  6.353466e-28 -3.373472e-30
[22,]  7.667824e-29  8.545748e-28  2.554415e-28 -1.994725e-26  1.059634e-28
[23,] -6.249423e-31 -6.980877e-30 -2.080585e-30  1.629393e-28 -8.638112e-31
[24,] -1.856746e-28 -2.068631e-27 -6.187642e-28  4.828591e-26 -2.564409e-28
[25,] -7.052219e-31 -7.867634e-30 -2.350634e-30  1.833730e-28 -9.751792e-31
[26,] -3.719340e-31 -4.147773e-30 -1.238072e-30  9.667528e-29 -5.137990e-31
[27,] -1.739912e-31 -1.942267e-30 -5.802170e-31  4.518472e-29 -2.404585e-31
[28,]  2.519983e-29  2.808451e-28  8.395032e-29 -6.556105e-27  3.484371e-29
[29,] -3.877129e-30 -4.318876e-29 -1.291529e-29  1.008602e-27 -5.353759e-30
[30,]  1.928938e-28  2.148652e-27  6.427389e-28 -5.017770e-26  2.663823e-28
[31,] -5.612540e-32 -6.259812e-31 -1.865483e-31  1.460883e-29 -7.687853e-32
[32,] -4.093298e-29 -4.559182e-28 -1.363505e-28  1.064679e-26 -5.652486e-29
              [,16]         [,17]         [,18]         [,19]         [,20]
 [1,] -9.595827e-16  1.367844e-17  2.948962e-16 -2.563614e-18 -1.693990e-19
 [2,] -1.789765e-17  2.551234e-19  5.500256e-18 -4.781524e-20 -3.159545e-21
 [3,]  2.694273e-16 -3.840571e-18 -8.279963e-17  7.198001e-19  4.756310e-20
 [4,]  7.851927e-17 -1.119258e-18 -2.413032e-17  2.097715e-19  1.386133e-20
 [5,] -5.130170e-28  7.310930e-30  1.576259e-28 -1.370092e-30 -9.058482e-32
 [6,] -7.980433e-26  1.136868e-27  2.451987e-26 -2.132427e-28 -1.408577e-29
 [7,] -2.285385e-29  3.259935e-31  7.001715e-30 -6.103032e-32 -4.022412e-33
 [8,] -2.736127e-26  3.899705e-28  8.405185e-27 -7.302266e-29 -4.827678e-30
 [9,] -4.641893e-28  6.615316e-30  1.425516e-28 -1.239827e-30 -8.195970e-32
[10,] -2.920611e-26  4.162907e-28  8.972320e-27 -7.800766e-29 -5.155904e-30
[11,] -1.369561e-28  1.953432e-30  4.210983e-29 -3.663370e-31 -2.419461e-32
[12,] -1.517194e-27  2.170416e-29  4.677498e-28 -4.061840e-30 -2.685645e-31
[13,]  5.782589e-04  6.506210e-30  1.402136e-28 -1.219411e-30 -8.060239e-32
[14,]  7.971757e-01 -5.069260e-28 -1.092077e-26  9.502096e-29  6.279540e-30
[15,]  1.072676e-01  2.692801e-30  5.813963e-29 -5.056336e-31 -3.343659e-32
[16,]  4.559717e+00  6.300068e-28  1.357244e-26 -1.180298e-28 -7.798789e-30
[17,]  6.310546e-28  2.994757e+00  4.675488e-04  2.946069e-05 -1.696130e-06
[18,]  1.359317e-26  4.675488e-04  7.487328e-01  1.790062e-07  5.264020e-04
[19,] -1.180519e-28  2.946069e-05  1.790062e-07  2.999742e+00  1.447805e-06
[20,] -7.764105e-30 -1.696130e-06  5.264020e-04  1.447805e-06  1.169999e-03
[21,] -7.895372e-28  1.125582e-29  2.426140e-28 -2.109161e-30 -1.394388e-31
[22,]  2.480065e-26 -3.532561e-28 -7.621155e-27  6.622732e-29  4.375838e-30
[23,] -2.020696e-28  2.878457e-30  6.206779e-29 -5.399652e-31 -3.572120e-32
[24,] -6.002633e-26  8.552189e-28  1.844243e-26 -1.602920e-28 -1.059630e-29
[25,] -2.283066e-28  3.251063e-30  7.006587e-29 -6.092915e-31 -4.028819e-32
[26,] -1.203190e-28  1.712748e-30  3.694550e-29 -3.208331e-31 -2.124870e-32
[27,] -5.628772e-29  8.039009e-31  1.729531e-29 -1.502545e-31 -9.926950e-33
[28,]  8.153063e-27 -1.162276e-28 -2.505148e-27  2.174296e-29  1.439710e-30
[29,] -1.253092e-27  1.786011e-29  3.850262e-28 -3.348014e-30 -2.211385e-31
[30,]  6.234352e-26 -8.886489e-28 -1.916284e-26  1.665590e-28  1.100392e-29
[31,] -1.817091e-29  2.595236e-31  5.574391e-30 -4.839400e-32 -3.189178e-33
[32,] -1.322755e-26  1.885696e-28  4.065278e-27 -3.534148e-29 -2.334613e-30
              [,21]         [,22]         [,23]         [,24]         [,25]
 [1,] -1.714636e-17  5.376991e-16 -4.387607e-18 -1.301863e-15 -4.960145e-18
 [2,] -3.198053e-19  1.002889e-17 -8.183544e-20 -2.428169e-17 -9.251413e-20
 [3,]  4.814279e-18 -1.509727e-16  1.231933e-18  3.655312e-16  1.392687e-18
 [4,]  1.403026e-18 -4.399802e-17  3.590224e-19  1.065269e-16  4.058712e-19
 [5,] -9.173263e-30  2.871608e-28 -2.348371e-30 -6.964408e-28 -2.653065e-30
 [6,] -1.425734e-27  4.469184e-26 -3.652467e-28 -1.083149e-25 -4.123215e-28
 [7,] -4.071066e-31  1.271414e-29 -1.048597e-31 -3.123619e-29 -1.184226e-31
 [8,] -4.888149e-28  1.530368e-26 -1.251163e-28 -3.709895e-26 -1.414057e-28
 [9,] -8.294157e-30  2.601241e-28 -2.122035e-30 -6.299302e-28 -2.399223e-30
[10,] -5.219393e-28  1.636175e-26 -1.334892e-28 -3.962969e-26 -1.509617e-28
[11,] -2.448417e-30  7.678900e-29 -6.273263e-31 -1.860391e-28 -7.092318e-31
[12,] -2.715296e-29  8.526813e-28 -6.968121e-30 -2.067165e-27 -7.860769e-30
[13,] -8.156680e-30  2.557150e-28 -2.086441e-30 -6.194461e-28 -2.361060e-30
[14,]  6.348888e-28 -1.991044e-26  1.625147e-28  4.824245e-26  1.838016e-28
[15,] -3.385173e-30  1.059640e-28 -8.658198e-31 -2.566407e-28 -9.809302e-31
[16,] -7.892068e-28  2.473979e-26 -2.019410e-28 -5.995653e-26 -2.284162e-28
[17,]  1.126375e-29 -3.531960e-28  2.882281e-30  8.556451e-28  3.258008e-30
[18,]  2.426171e-28 -7.610004e-27  6.208005e-29  1.842681e-26  7.019494e-29
[19,] -2.109164e-30  6.612560e-29 -5.401491e-31 -1.599929e-28 -6.095654e-31
[20,] -1.389858e-31  4.355457e-30 -3.574260e-32 -1.049315e-29 -4.009216e-32
[21,]  3.000775e+00  4.622701e-02 -2.524242e-04 -5.269197e-05 -4.081249e-30
[22,]  4.622701e-02  4.901931e+00 -1.844299e-04  3.837226e-01  1.281974e-28
[23,] -2.524242e-04 -1.844299e-04  3.014184e+00  1.161340e-01 -1.043235e-30
[24,] -5.269197e-05  3.837226e-01  1.161340e-01  5.942804e+00 -3.102628e-28
[25,] -4.070633e-30  1.278815e-28 -1.042407e-30 -3.092275e-28  2.999343e+00
[26,] -2.145689e-30  6.730672e-29 -5.496246e-31 -1.630692e-28  1.724115e-05
[27,] -1.003712e-30  3.152358e-29 -2.570770e-31 -7.628787e-29 -1.268875e-04
[28,]  1.455736e-28 -4.569729e-27  3.724608e-29  1.104779e-26 -8.006293e-07
[29,] -2.239158e-29  7.020686e-28 -5.729307e-30 -1.699621e-27 -6.477242e-30
[30,]  1.113929e-27 -3.493398e-26  2.849724e-28  8.456973e-26  3.222314e-28
[31,] -3.253189e-31  1.018552e-29 -8.230944e-32 -2.435020e-29 -9.400160e-32
[32,] -2.363973e-28  7.411665e-27 -6.047633e-29 -1.794093e-26 -6.836539e-29
              [,26]         [,27]         [,28]         [,29]         [,30]
 [1,] -2.615794e-18 -1.220793e-18  1.766722e-16 -2.717455e-17  1.354155e-15
 [2,] -4.878847e-20 -2.276962e-20  3.295201e-18 -5.068461e-19  2.525701e-17
 [3,]  7.344509e-19  3.427688e-19 -4.960523e-17  7.629948e-18 -3.802134e-16
 [4,]  2.140412e-19  9.989320e-20 -1.445646e-17  2.223598e-18 -1.108057e-16
 [5,] -1.399217e-30 -6.533816e-31  9.442417e-29 -1.452376e-29  7.234597e-28
 [6,] -2.176055e-28 -1.015493e-28  1.469791e-26 -2.258727e-27  1.125846e-25
 [7,] -6.235504e-32 -2.917727e-32  4.182849e-30 -6.475834e-31  3.209560e-29
 [8,] -7.458400e-29 -3.482923e-29  5.036599e-27 -7.736962e-28  3.856019e-26
 [9,] -1.265453e-30 -5.906079e-31  8.546635e-29 -1.314503e-29  6.550740e-28
[10,] -7.962470e-29 -3.715629e-29  5.376670e-27 -8.267388e-28  4.121796e-26
[11,] -3.736761e-31 -1.744352e-31  2.522438e-29 -3.883730e-30  1.935860e-28
[12,] -4.154035e-30 -1.937265e-30  2.801318e-28 -4.305232e-29  2.149582e-27
[13,] -1.242664e-30 -5.803728e-31  8.400851e-29 -1.292685e-29  6.436040e-28
[14,]  9.685437e-29  4.524362e-29 -6.545388e-27  1.006683e-27 -5.011863e-26
[15,] -5.153579e-31 -2.406066e-31  3.466983e-29 -5.368104e-30  2.669007e-28
[16,] -1.203194e-28 -5.616714e-29  8.132418e-27 -1.250936e-27  6.226463e-26
[17,]  1.719645e-30  8.022379e-31 -1.161244e-28  1.783334e-29 -8.899888e-28
[18,]  3.703392e-29  1.727931e-29 -2.499147e-27  3.843942e-28 -1.915549e-26
[19,] -3.216222e-31 -1.500893e-31  2.173353e-29 -3.341103e-30  1.666085e-28
[20,] -2.110765e-32 -9.883946e-33  1.442538e-30 -2.189030e-31  1.098660e-29
[21,] -2.153084e-30 -1.004663e-30  1.453389e-28 -2.235903e-29  1.114015e-27
[22,]  6.757907e-29  3.156115e-29 -4.563774e-27  7.021324e-28 -3.498947e-26
[23,] -5.514464e-31 -2.570788e-31  3.725686e-29 -5.723155e-30  2.851729e-28
[24,] -1.636256e-28 -7.638727e-29  1.104973e-26 -1.700083e-27  8.467996e-26
[25,]  1.724115e-05 -1.268875e-04 -8.006293e-07 -6.462129e-30  3.217333e-28
[26,]  1.234472e-03  1.328437e-06  1.879911e-03 -3.406093e-30  1.696949e-28
[27,]  1.328437e-06  2.999704e+00  4.100950e-03 -1.597326e-30  7.928750e-29
[28,]  1.879911e-03  4.100950e-03  5.432680e+00  2.306310e-28 -1.150571e-26
[29,] -3.414886e-30 -1.594226e-30  2.307024e-28  2.999360e+00  2.699558e-02
[30,]  1.699256e-28  7.931423e-29 -1.147833e-26  2.699558e-02  5.512698e+00
[31,] -4.952154e-32 -2.298961e-32  3.315877e-30 -1.499973e-04 -5.887074e-05
[32,] -3.605049e-29 -1.683047e-29  2.435624e-27  7.643055e-06  8.543797e-02
              [,31]         [,32]
 [1,] -3.940157e-19 -2.871475e-16
 [2,] -7.348983e-21 -5.355731e-18
 [3,]  1.106300e-19  8.062398e-17
 [4,]  3.224092e-20  2.349627e-17
 [5,] -2.105860e-31 -1.534393e-28
 [6,] -3.276833e-29 -2.385465e-26
 [7,] -9.371464e-33 -6.793237e-30
 [8,] -1.122787e-29 -8.179275e-27
 [9,] -1.904373e-31 -1.388643e-28
[10,] -1.198734e-29 -8.734729e-27
[11,] -5.624947e-32 -4.099666e-29
[12,] -6.249267e-31 -4.553902e-28
[13,] -1.872066e-31 -1.365177e-28
[14,]  1.458805e-29  1.063522e-26
[15,] -7.752215e-32 -5.654198e-29
[16,] -1.812851e-29 -1.321544e-26
[17,]  2.589024e-31  1.886373e-28
[18,]  5.576570e-30  4.067083e-27
[19,] -4.847200e-32 -3.529292e-29
[20,] -3.175056e-33 -2.316977e-30
[21,] -3.241034e-31 -2.362629e-28
[22,]  1.017540e-29  7.417408e-27
[23,] -8.300678e-32 -6.039945e-29
[24,] -2.462713e-29 -1.795377e-26
[25,] -9.364939e-32 -6.831231e-29
[26,] -4.936470e-32 -3.598806e-29
[27,] -2.307516e-32 -1.685337e-29
[28,]  3.343869e-30  2.440024e-27
[29,] -1.499973e-04  7.643055e-06
[30,] -5.887074e-05  8.543797e-02
[31,]  3.155432e+00  3.093342e-01
[32,]  3.093342e-01  6.608765e+00

Model rank =  400 / 400 

Basis dimension (k) checking results. Low p-value (k-index<1) may
indicate that k is too low, especially if edf is close to k'.

                                              k'   edf k-index p-value    
te(knot,normtime):participant.dimvp004.x_z 24.00  9.00    0.97  <2e-16 ***
te(knot,normtime):participant.dimvp004.y_z 24.00 13.14    0.97   0.005 ** 
te(knot,normtime):participant.dimvp005.x_z 24.00 20.65    0.97   0.005 ** 
te(knot,normtime):participant.dimvp005.y_z 24.00 21.19    0.97   0.005 ** 
te(knot,normtime):participant.dimvp006.x_z 24.00 20.06    0.97  <2e-16 ***
te(knot,normtime):participant.dimvp006.y_z 24.00 18.52    0.97   0.005 ** 
te(knot,normtime):participant.dimvp008.x_z 24.00 21.97    0.97   0.015 *  
te(knot,normtime):participant.dimvp008.y_z 24.00 22.04    0.97  <2e-16 ***
te(knot,normtime):participant.dimvp009.x_z 24.00 11.55    0.97   0.005 ** 
te(knot,normtime):participant.dimvp009.y_z 24.00  9.05    0.97   0.005 ** 
te(knot,normtime):participant.dimvp010.x_z 24.00 22.11    0.97  <2e-16 ***
te(knot,normtime):participant.dimvp010.y_z 24.00 23.29    0.97   0.010 ** 
te(knot,normtime):participant.dimvp020.x_z 24.00  9.03    0.97  <2e-16 ***
te(knot,normtime):participant.dimvp020.y_z 24.00 20.01    0.97   0.005 ** 
te(knot,normtime):participant.dimvp023.x_z 24.00 21.85    0.97  <2e-16 ***
te(knot,normtime):participant.dimvp023.y_z 24.00 23.92    0.97   0.005 ** 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Code
acf_resid(GAM_dyn)

Now that we calculated our model, we plot the predictions of the tongue movement over time per participant.

Code
gam_dyn_pred <- get_predictions(GAM_dyn,
                        cond = list(knot = 0:10,
                                    normtime = c(0,0.25,0.5,0.75,1), # five time points
                                    participant.dim = rhotic_data_long$participant.dim %>% levels()),
                        se = FALSE,
                        print.summary = FALSE) %>% 
  separate_wider_delim(participant.dim, ".", names = c("participant", "dimension")) %>% 
  pivot_wider(names_from = dimension, values_from = "fit") %>% 
  mutate(part.normtime = paste0(participant, ".", normtime)) # for plotting

gam_dyn_pred_plot <- ggplot(gam_dyn_pred) +
  aes(x_z, y_z, group = part.normtime, color = normtime) +
  geom_path() +
  # geom_point(aes(color = participant)) +
  geom_text(aes(label = knot), check_overlap = TRUE,
            nudge_x = .05, nudge_y = .05, show.legend = FALSE) +
  facet_wrap(vars(participant), ncol = 4) +
  theme(legend.position = "bottom")
Code
gam_dyn_pred_plot
Figure 32: GAM prediction of dynamic polar data

We can observe great differences in the tongue movement of the participants. Some participants barely show any tongue movement during the articulation of the rhotic (participant 4, 5, 6, 9), while the range of movement of other participants is rather big. It is interesting to see that the tongue movement of the participants seemingly using front rhotics (participant 8, 10 and 23) is larger in general. But there are also individual differences within this group – while participant 8 seems to mainly move their tongue upward, the predicted contours of participant 10 and 23 look like they first move upward, then move downward. If we tried to interpret these results, we could say that the predictions of participant 10 and 23 look more like taps. The only person whose tongue contours look like they produce a back/uvular rhotic and has a lot of tongue movement is participant 20. Their tongue tip clearly moves downward as time progresses.

We chose to display the predictions using five time points (at 0 %, 25 %, 50 %, 75 % and 100 % of the time) in the plot above. Of course we can also change that and display more time points to get a higher resolution on the time axis:

Code
gam_dyn_pred_f <- get_predictions(GAM_dyn,
                        cond = list(knot = 0:10,
                                    normtime = (0:10)/10,
                                    participant.dim = rhotic_data_long$participant.dim %>% levels()),
                        se = FALSE,
                        print.summary = FALSE) %>% 
  separate_wider_delim(participant.dim, ".", names = c("participant", "dimension")) %>% 
  pivot_wider(names_from = dimension, values_from = "fit") %>% 
  mutate(part.normtime = paste0(participant, ".", normtime)) # for plotting

gam_dyn_pred_plot_f <- ggplot(gam_dyn_pred_f) +
  aes(x_z, y_z, group = part.normtime, color = normtime) +
  geom_path() +
  facet_wrap(vars(participant), ncol = 4) +
  theme(legend.position = "bottom")
Code
gam_dyn_pred_plot_f
Figure 33: GAM prediction of dynamic polar data; higher resolution on time axis

2.2 (M)FPCA

2.2.1 Polar coordinates

We will do it similar as in Section 1.2.1. We will create a funData object, this time with two dimensions instead of one. Again, we want to predict radius by angle * time. We use the Procrustean normalized data from Section 2.1.1.

Since we now need fixed normtime-points that are the same for every tongue contour, we need to interpolate the data set with regard to normtime. We take the 6 time points that we also took for displaying the GAM predictions (e.g., in Figure 28 (0, 0.2, 0.4, 0.6, 0.8, 1.0)) to create data points.

Code
proc_polar_curves_norm <- proc_polar_curves %>% 
  group_by(participant, clip_id, angle) %>% 
  # interpolation
  reframe(bind_cols(
    approx(normtime, radius, normtime_grid, rule = 2) %>% as_tibble()
    )) %>% 
  rename(normtime = x,radius = y)

Now we create the funData object:

Code
# build a funData object

argvals <- list(normtime=normtime_grid, angle=angle_grid)

# X: array of dim c(nFrames, length(normtime_grid), length(angle_grid))
X <- proc_polar_curves_norm %>%
  pivot_wider(id_cols = c(clip_id, normtime), names_from = angle, values_from = radius) %>%
  select(-normtime) %>%
  nest(.by = clip_id) %>%
  pull(data) %>%
  lapply(as.matrix) %>%
  abind(along = 0) # combine multi-dimensional arrays

fpca_polar_dyn_curves <- funData(argvals, X)
fpca_polar_dyn_curves
Functional data with 2170 observations of 2-dimensional support
argvals:
    0 0.2 ... 1     (6 sampling points)
    1.131963 1.262452 ... 2.436858      (11 sampling points)
X:
    array of size 2170 x 6 x 11 

We’ve now got a funData object with 6 sampling points in regard to normtime and 11 sampling points in regard to angle (= our fake knots).

Calculate the FPCA:

Code
# FCP_TPA
fpca_polar_dyn <- MFPCA(fpca_polar_dyn_curves %>% multiFunData(),
              M = 5, # number of principle components to calculate
              # FCP_TPA is one of the possible multidim. support parametrizations
              # alphaRange = "A list of length 2 with entries v and w, containing the range of smoothness parameters to test for each direction" (source: MFPCA documentation)
              uniExpansions = list(list(type = "FCP_TPA",
                                        npc = 5,
                                        alphaRange = list(v = c(10^-2, 10^2), w = c(10^-3, 10^3)))) 
)
Code
# Prop of explained var
PropExpVar <- round(fpca_polar_dyn$values/sum(fpca_polar_dyn$values), digits = 3)
PropExpVar
[1] 0.535 0.246 0.111 0.076 0.032

First PC explains 53 % of the data, second PC explains 25 % of the data, third 11 %.

Code
nPC <- 2 # we only look at the first 2 PCs from now on

# scores st. dev.
sdFun <- fpca_polar_dyn$values %>% sqrt()

# PC curves to be plotted
PCcurves_polar_dyn <- expand_grid(PC = 1:nPC,
                        fractionOfStDev = c(-1, 0, 1)) %>%
  group_by(PC, fractionOfStDev) %>%
  reframe(
    (fpca_polar_dyn$meanFunction[[1]]@X[1,,] + fractionOfStDev * sdFun[PC] * fpca_polar_dyn$functions[[1]]@X[PC,,]) %>% 
      `colnames<-`(angle_grid) %>% 
      as_tibble() %>% 
      bind_cols(normtime = normtime_grid)
  ) %>% 
  pivot_longer(cols = -c(PC, fractionOfStDev, normtime), names_to = "angle", values_to = "radius") %>% 
  mutate(angle = as.numeric(angle))
  
# Plot
FPCA_polar_dyn_curves_plot <- ggplot(PCcurves_polar_dyn %>% rename(`score/std` = fractionOfStDev)) +
  aes(x = angle, y = radius, group = normtime, color = normtime) +
  geom_path() +
  facet_grid(PC ~ `score/std`, 
             labeller = labeller(PC = ~ str_glue("PC{.x}"), `score/std` = label_both),
             ) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
FPCA_polar_dyn_curves_plot
Figure 34: FPCA: 2 Principle Components for Polar dynamic data

The PCs again capture the whole variation of tongue movement over time of the entire dataset. The top row shows the predictions using the PC1 score, the bottom row the ones using the PC2 score. On the middle columns, the mean tongue movement is pictured. On the left, the deviation from the mean of -1 standard deviation of the PC scores is pictured, on the right, +1 standard deviation.

The mean function is a tongue shape with an arched tongue dorsum and a tongue tip that is rather neutral (it’s not exactly pointing down but it’s also not pointing upward as well). There is hardly any movement, if one looks closely, it becomes evident that the tongue moves upward slightly as a whole. PC1 seems to describe a variation of the movement in the tongue tip region – on the -1 sd plot, the tongue tip moves upward as time progresses, on the +1 sd side, the tongue tip moves very slightly downward. PC2 captures a variation of the whole tongue moving either upward (-1 sd) or downward (+1 sd).

Now we want to see if there is a difference in the PC scores between the participants:

Code
# collect PC scores
PCscores_polar_dyn <- fpca_polar_dyn$scores[, 1:nPC] %>%
  `colnames<-`(paste0("s", 1:nPC)) %>%
  as_tibble() %>%
  bind_cols(proc_polar_curves_norm %>% distinct(clip_id, participant), .)

# boxplots PC scores by participant
FPCA_polar_dyn_boxplot <- PCscores_polar_dyn %>% 
  pivot_longer(cols = paste0("s", 1:nPC), 
               names_to = "PC", values_to = "score") %>% 
  filter(PC %in% str_c("s", 1:nPC)) %>% 
  ggplot() +
  aes(x = participant, y = score, color = participant) +
  geom_boxplot() +
  facet_wrap(~ PC) +
  theme(legend.position = "bottom", axis.text.x = element_text(angle = 45, vjust = 0.5, hjust=1))
Code
FPCA_polar_dyn_boxplot
Figure 35: FPCA: distribution of PC scores by participant

There are differences between the participants in both of the PC scores. It seems like they are both correlated to the variable participant. Again, we can calculate linear models to confirm that the scores are correlated to participant and plot the predicted tongue movement of the participants made by the linear models.

Code
mod_s1 <- lm(s1 ~ participant, data = PCscores_polar_dyn)
mod_s1 %>% summary()

Call:
lm(formula = s1 ~ participant, data = PCscores_polar_dyn)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.61797 -0.07249 -0.00205  0.07222  0.41471 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)      -0.11245    0.00911 -12.344  < 2e-16 ***
participantvp005  0.02244    0.01113   2.016  0.04393 *  
participantvp006  0.12865    0.01111  11.574  < 2e-16 ***
participantvp008  0.38265    0.01123  34.066  < 2e-16 ***
participantvp009 -0.17846    0.01284 -13.897  < 2e-16 ***
participantvp010  0.22324    0.01111  20.084  < 2e-16 ***
participantvp020  0.03431    0.01109   3.093  0.00201 ** 
participantvp023  0.09905    0.01110   8.920  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1127 on 2162 degrees of freedom
Multiple R-squared:  0.6208,    Adjusted R-squared:  0.6196 
F-statistic: 505.7 on 7 and 2162 DF,  p-value: < 2.2e-16
Code
mod_s2 <- lm(s2 ~ participant, data = PCscores_polar_dyn)
mod_s2 %>% summary()

Call:
lm(formula = s2 ~ participant, data = PCscores_polar_dyn)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.284112 -0.055889  0.001706  0.056075  0.243172 

Coefficients:
                  Estimate Std. Error t value Pr(>|t|)    
(Intercept)       0.071273   0.006796  10.487  < 2e-16 ***
participantvp005 -0.144580   0.008306 -17.408  < 2e-16 ***
participantvp006 -0.200953   0.008292 -24.234  < 2e-16 ***
participantvp008  0.025316   0.008380   3.021 0.002548 ** 
participantvp009 -0.016356   0.009580  -1.707 0.087901 .  
participantvp010 -0.023397   0.008292  -2.822 0.004823 ** 
participantvp020 -0.170687   0.008275 -20.626  < 2e-16 ***
participantvp023  0.030411   0.008284   3.671 0.000247 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.08406 on 2162 degrees of freedom
Multiple R-squared:  0.5412,    Adjusted R-squared:  0.5398 
F-statistic: 364.4 on 7 and 2162 DF,  p-value: < 2.2e-16

Plot prediction PC1:

Code
predCurves_s1 <- emmeans(mod_s1, pairwise ~ participant)$emmeans %>% 
  as_tibble() %>% 
  group_by(participant) %>% 
reframe(
    (fpca_polar_dyn$meanFunction[[1]]@X[1,,] + emmean * fpca_polar_dyn$functions[[1]]@X[s,,]) %>% 
      `colnames<-`(angle_grid) %>% 
      as_tibble() %>% 
      bind_cols(normtime = normtime_grid)
  ) %>% 
  pivot_longer(cols = -c(participant, normtime), names_to = "angle", values_to = "radius") %>% 
  mutate(angle = as.numeric(angle))

FPCA_polar_dyn_pred_s1_plot <- ggplot(predCurves_s1) +
  aes(x = angle, y = radius, group = normtime, color = normtime) +
  geom_path() +
  facet_wrap(vars(participant), ncol = 3) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
FPCA_polar_dyn_pred_s1_plot
Figure 36: FPCA: Predicted tongue contours by PC1 score

As already expected, we observe a lot of variation in the tongue tip area of the participants. What’s interesting this time are the differences in the tongue tip movements. While the tongue tip of participant 6, 8 and 10 show an upward movement, the tongue tip of participant 9 moves downward. Participants 4, 5 and 20 show barely any movement at all. Participant 23’s tongue tip first moves upward then downward again.

Code
predCurves_s2 <- emmeans(mod_s2, pairwise ~ participant)$emmeans %>% 
  as_tibble() %>% 
  group_by(participant) %>% 
reframe(
    (fpca_polar_dyn$meanFunction[[1]]@X[1,,] + emmean * fpca_polar_dyn$functions[[1]]@X[s,,]) %>% 
      `colnames<-`(angle_grid) %>% 
      as_tibble() %>% 
      bind_cols(normtime = normtime_grid)
  ) %>% 
  pivot_longer(cols = -c(participant, normtime), names_to = "angle", values_to = "radius") %>% 
  mutate(angle = as.numeric(angle))

FPCA_polar_dyn__pred_s2_plot <- ggplot(predCurves_s2) +
  aes(x = angle, y = radius, group = normtime, color = normtime) +
  geom_path() +
  facet_wrap(vars(participant), ncol = 3) +
  radial_plot(5) +
  theme(legend.position = "bottom")
Code
FPCA_polar_dyn__pred_s2_plot
Figure 37: FPCA: Predicted tongue contours by PC2 score

s2 does not show many differences between the participants’ tongue contour movements but more general tongue shape differences in the tongue tip area (e.g., tongue tip of participants 8 and 23 being slightly higher than most others).

2.2.2 Cartesian coordinates

Now we will calculate a multivariate fpca model that predicts both x and y values by the predictors normtime and knot.

We’re doing the same as in Section 2.2.1 and interpolate on the normtime dimension so every clip has the same number of timepoints/frames:

Code
# i'm pretty sure this is not the best way to do this ... but it works
cartesian_curves_norm <- rhotic_data %>% 
  group_by(participant, clip_id, knot) %>% 
  # interpolation
  reframe(bind_cols(
    approx(normtime, x_z, normtime_grid, rule = 2) %>% as_tibble() # interpolate x_z by normtime_grid
    )) %>% 
  rename(normtime = x,x_z = y) %>% 
  merge( # merge interpolated x_z and interpolated y_z values
    rhotic_data %>% 
  group_by(participant, clip_id, knot) %>% 
  # interpolation
  reframe(bind_cols(
    approx(normtime, y_z, normtime_grid, rule = 2) %>% as_tibble() # interpolate y_z by normtime_grid
    )) %>% 
  rename(normtime = x, y_z = y), by = c("participant", "clip_id", "knot", "normtime")
  ) %>% mutate(knot = as.integer(knot)) %>% arrange(knot)

Build a funData object:

Code
argvals <- list(normtime=normtime_grid, knot=0:10)
# curvesFun is a multiFunData with two response outcomes (x, y) -> predicted variables
# each of which is an array of dim c(nClips, length(normtime_grid), number of knots)
cartesian_curves_dyn <- multiFunData( 
                     lapply(c('x_z', 'y_z'), function(dim) {
                       funData(argvals,
                       cartesian_curves_norm %>%
                         pivot_wider(id_cols = c(clip_id, normtime), names_from = knot, values_from = {{dim}}) %>% 
                         select(-normtime) %>% 
                         nest(.by = clip_id) %>% 
                         pull(data) %>% 
                         lapply(as.matrix) %>% 
                         abind(along = 0)
                     )}))
cartesian_curves_dyn
An object of class "multiFunData"
[[1]]
Functional data with 2170 observations of 2-dimensional support
argvals:
    0 0.2 ... 1     (6 sampling points)
    0 1 ... 10      (11 sampling points)
X:
    array of size 2170 x 6 x 11 

[[2]]
Functional data with 2170 observations of 2-dimensional support
argvals:
    0 0.2 ... 1     (6 sampling points)
    0 1 ... 10      (11 sampling points)
X:
    array of size 2170 x 6 x 11 

Same procedure as in Section 2.2.1:

Code
# FCP_TPA
# the same base expansion applied to x and y 
uniExpansion <- list(type = "FCP_TPA",
                                        npc = 5,
                                        alphaRange = list(v = c(10^-2, 10^2), w = c(10^-3, 10^3)))
fpca_cartesian_dyn <- MFPCA(cartesian_curves_dyn,
              M = 5,
              uniExpansions = list(uniExpansion, uniExpansion)
)

# Prop of explained var
PropExpVar <- round(fpca_cartesian_dyn$values/sum(fpca_cartesian_dyn$values), digits = 3)
Code
PropExpVar
[1] 0.536 0.251 0.108 0.069 0.036

PC1 explains 54 % of the variation in the data, PC2 25 % and PC3 11 %.

Code
nPC <- 2 # working with two PCs again
# scores st. dev.
sdFun <- fpca_cartesian_dyn$values %>% sqrt()

# PC curves to be plotted
PCcurves_cartesian_dyn <- expand_grid(PC = 1:nPC,
                        fractionOfStDev = c(-1, 0, 1),
                        dim = 1:2) %>%
  group_by(PC, fractionOfStDev, dim) %>%
  reframe(
    (fpca_cartesian_dyn$meanFunction[[dim]]@X[1,,] + fractionOfStDev * sdFun[PC] * fpca_cartesian_dyn$functions[[dim]]@X[PC,,]) %>% 
      `colnames<-`(0:10) %>% 
      as_tibble() %>% 
      bind_cols(normtime = normtime_grid)
  ) %>% 
  pivot_longer(cols = -c(PC, fractionOfStDev, normtime, dim), names_to = "knot", values_to = "pos") %>% 
  mutate(knot = as.integer(knot),
         dim = factor(dim, levels = 1:2, labels = c('x_z','y_z'))) %>% 
  pivot_wider(names_from = dim, values_from = pos)
  
# Plot
FPCA_cartesian_dyn_plot <- ggplot(PCcurves_cartesian_dyn %>% rename(`score/std` = fractionOfStDev)) +
  aes(x = x_z, y = y_z, group = normtime, color = normtime) +
  geom_path() +
  facet_grid(PC ~ `score/std`, 
             labeller = labeller(PC = ~ str_glue("PC{.x}"), `score/std` = label_both),
             ) +
  theme(legend.position = "bottom")

# collect PC scores
PCscores_cartesian_dyn <- fpca_cartesian_dyn$scores[, 1:nPC] %>%
  `colnames<-`(paste0("s", 1:nPC)) %>%
  as_tibble() %>%
  bind_cols(cartesian_curves_norm %>% distinct(clip_id, participant), .)

# boxplots PC scores by participant
FPCA_cartesian_dyn_scores_plot <- PCscores_cartesian_dyn %>% 
  pivot_longer(cols = paste0("s", 1:nPC), 
               names_to = "PC", values_to = "score") %>% 
  filter(PC %in% str_c("s", 1:nPC)) %>% 
  ggplot() +
  aes(x = participant, y = score, color = participant) +
  geom_boxplot() +
  facet_wrap(~ PC) +
  theme(legend.position = "bottom")

Plot:

Code
FPCA_cartesian_dyn_plot
Figure 38: FPCA: 2 Principle Components for Cartesian dynamic data

The plot is structured similarly to Figure 34. Instead of angle/radius values, x/y values are plotted. It is striking that both PCs show a lot of variation in regard to the tongue shape. A small PC1 score shows a tongue shape with an upward-pointing tongue tip. Regarding the time axis, the tongue tip moves upward as time progresses. A large PC1 score pictures a downward-pointing tongue tip. In the time axis, the tongue tip also moves downward over time. A small PC2 score pictures a tongue contour with a downward-pointing tongue tip and a backward tongue movement in the tongue dorsum area. A large PC2 score pictures barely any tongue movement but an overall raised tongue (higher y-value).

Code
FPCA_cartesian_dyn_scores_plot
Figure 39: FPCA: distribution of PC scores by participant

When comparing the PC scores of the participants, it becomes evident that s1 differs while s2 doesn’t. We can deduce that s1 probably correlates with the variable participant while s2 does not.

We can test the correlation of s1 and participant using a linear model again:

Code
mod_s1 <- lm(s1 ~ participant, data = PCscores_cartesian_dyn)
mod_s1 %>% summary()

Call:
lm(formula = s1 ~ participant, data = PCscores_cartesian_dyn)

Residuals:
     Min       1Q   Median       3Q      Max 
-1.34487 -0.15125  0.01053  0.16063  0.83071 

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)    
(Intercept)       0.15285    0.01862   8.208 3.83e-16 ***
participantvp005  0.27537    0.02276  12.100  < 2e-16 ***
participantvp006 -0.11211    0.02272  -4.934 8.67e-07 ***
participantvp008 -1.14904    0.02296 -50.039  < 2e-16 ***
participantvp009  0.80486    0.02625  30.660  < 2e-16 ***
participantvp010 -0.57161    0.02272 -25.156  < 2e-16 ***
participantvp020  0.14757    0.02268   6.508 9.44e-11 ***
participantvp023 -0.11667    0.02270  -5.140 3.00e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2303 on 2162 degrees of freedom
Multiple R-squared:  0.8346,    Adjusted R-squared:  0.834 
F-statistic:  1558 on 7 and 2162 DF,  p-value: < 2.2e-16

Pairwise comparisons of participants using emmeans:

Code
emmeans(mod_s1, pairwise ~ participant)
$emmeans
 participant  emmean     SE   df lower.CL upper.CL
 vp004        0.1528 0.0186 2162   0.1163   0.1894
 vp005        0.4282 0.0131 2162   0.4026   0.4539
 vp006        0.0407 0.0130 2162   0.0152   0.0663
 vp008       -0.9962 0.0134 2162  -1.0225  -0.9698
 vp009        0.9577 0.0185 2162   0.9214   0.9940
 vp010       -0.4188 0.0130 2162  -0.4443  -0.3932
 vp020        0.3004 0.0129 2162   0.2750   0.3258
 vp023        0.0362 0.0130 2162   0.0107   0.0616

Confidence level used: 0.95 

$contrasts
 contrast      estimate     SE   df t.ratio p.value
 vp004 - vp005 -0.27537 0.0228 2162 -12.100  <.0001
 vp004 - vp006  0.11211 0.0227 2162   4.934  <.0001
 vp004 - vp008  1.14904 0.0230 2162  50.039  <.0001
 vp004 - vp009 -0.80486 0.0263 2162 -30.660  <.0001
 vp004 - vp010  0.57161 0.0227 2162  25.156  <.0001
 vp004 - vp020 -0.14757 0.0227 2162  -6.508  <.0001
 vp004 - vp023  0.11667 0.0227 2162   5.140  <.0001
 vp005 - vp006  0.38749 0.0185 2162  20.993  <.0001
 vp005 - vp008  1.42441 0.0188 2162  75.960  <.0001
 vp005 - vp009 -0.52948 0.0227 2162 -23.366  <.0001
 vp005 - vp010  0.84699 0.0185 2162  45.888  <.0001
 vp005 - vp020  0.12780 0.0184 2162   6.946  <.0001
 vp005 - vp023  0.39204 0.0184 2162  21.274  <.0001
 vp006 - vp008  1.03692 0.0187 2162  55.426  <.0001
 vp006 - vp009 -0.91697 0.0226 2162 -40.531  <.0001
 vp006 - vp010  0.45950 0.0184 2162  24.955  <.0001
 vp006 - vp020 -0.25969 0.0184 2162 -14.148  <.0001
 vp006 - vp023  0.00456 0.0184 2162   0.248  1.0000
 vp008 - vp009 -1.95389 0.0229 2162 -85.454  <.0001
 vp008 - vp010 -0.57742 0.0187 2162 -30.864  <.0001
 vp008 - vp020 -1.29661 0.0187 2162 -69.519  <.0001
 vp008 - vp023 -1.03237 0.0187 2162 -55.267  <.0001
 vp009 - vp010  1.37647 0.0226 2162  60.841  <.0001
 vp009 - vp020  0.65728 0.0226 2162  29.113  <.0001
 vp009 - vp023  0.92153 0.0226 2162  40.775  <.0001
 vp010 - vp020 -0.71919 0.0184 2162 -39.182  <.0001
 vp010 - vp023 -0.45494 0.0184 2162 -24.747  <.0001
 vp020 - vp023  0.26424 0.0183 2162  14.419  <.0001

P value adjustment: tukey method for comparing a family of 8 estimates 
Code
predCurves_cartesian_dyn <- emmeans(mod_s1, pairwise ~ participant)$emmeans %>% 
  as_tibble() %>% 
  expand_grid(dim = 1:2) %>% 
  group_by(participant, dim) %>% 
reframe(
    (fpca_cartesian_dyn$meanFunction[[dim]]@X[1,,] + emmean * fpca_cartesian_dyn$functions[[dim]]@X[s,,]) %>% 
      `colnames<-`(0:10) %>% # knots
      as_tibble() %>% 
      bind_cols(normtime = normtime_grid)
  ) %>% 
  pivot_longer(cols = -c(participant, normtime, dim), names_to = "knot", values_to = "pos") %>% 
  mutate(knot = as.integer(knot),
         dim = factor(dim, levels = 1:2, labels = c('x_z','y_z'))) %>% 
  pivot_wider(names_from = dim, values_from = pos)

FPCA_cartesian_dyn_pred_plot <- ggplot(predCurves_cartesian_dyn) +
  aes(x = x_z, y = y_z, group = normtime, color = normtime) +
  geom_path() +
  facet_grid(~ participant) +
  theme(legend.position = "bottom")
Code
FPCA_cartesian_dyn_pred_plot
Figure 40: FPCA: Predicted tongue contours by PC1 score

The predicted tongue contours differ quite a lot. While the tongue tip of most participants curve downward, the contour of participant 8 bends clearly upward. For participant 8, there is also an upward movement visible in the time axis. The same is true for participant 10. For both of these participants the tongue also moves back downward at the very end of the normtime. Particularly interesting is the fact that this is not the case for participant 23. Their predicted tongue contour shows a downward-curved tongue tip although most other predictions, e.g., Figure 32, picture participant 23 with an upward-pointing tongue tip. Here, the variation concerning participant 23 might not be covered by PC1 but maybe by another PC-score.

3 References

Coretta, Stefano, and Georges Sakr. 2025. “Stefanocoretta/Mv_uti: V1.1.” Zenodo. https://doi.org/10.5281/zenodo.15335918.
Gubian, Michele. 2025. “Analysis of Uni- and Multi-Dimensional Contours with GAMs and Functional PCA: An Application to Ultrasound Tongue Imaging.” https://github.com/uasolo/FPCA-phonetics-workshop/blob/master/presentations/UTI/UTI_analysis_code.html.
Gubian, Michele, Francisco Torreira, and Lou Boves. 2015. “Using Functional Data Analysis for Investigating Multidimensional Dynamic Phonetic Contrasts.” Journal of Phonetics 49: 16–40. https://doi.org/10.1016/j.wocn.2014.10.001.
Wieling, Martijn. 2018. “Analyzing Dynamic Phonetic Data Using Generalized Additive Mixed Modeling: A Tutorial Focusing on Articulatory Differences Between L1 and L2 Speakers of English.” Journal of Phonetics 70: 86–116. https://doi.org/10.1016/j.wocn.2018.03.002.